A First-Order Primal-Dual Method for Nonconvex Constrained Optimization Based on the Augmented Lagrangian

Daoli Zhu
Daoli Zhu
[email protected]
Antai College of Economics and Management and Sino-US Global Logistics Institute, Shanghai Jiao Tong University, Shanghai 200030, China;School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen 518172, China;
Search for more papers by this author
,
Lei Zhao
Lei Zhao
[email protected]
Institute of Translational Medicine and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, China;
Search for more papers by this author
,
Shuzhong Zhang
Corresponding Author
Shuzhong Zhang
[email protected]
https://orcid.org/0000-0001-5838-1676
Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, Minnesota 55455
Search for more papers by this author

Antai College of Economics and Management and Sino-US Global Logistics Institute, Shanghai Jiao Tong University, Shanghai 200030, China;School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen 518172, China;

Search for more papers by this author

Lei Zhao

[email protected]

Institute of Translational Medicine and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, China;

Search for more papers by this author

Shuzhong Zhang

Corresponding Author

Shuzhong Zhang

[email protected]

https://orcid.org/0000-0001-5838-1676

Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, Minnesota 55455

Search for more papers by this author

Published Online:20 Mar 2023https://doi.org/10.1287/moor.2022.1350

References

[1] Apicella A, Donnarumma F, Isgró F, Prevete R (2021) A survey on modern trainable activation functions. Neural Networks 138:14–32.Crossref, Google Scholar
[2] Attouch H, Bolte J, Svaiter BF (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Programming 137(1–2):91–129.Crossref, Google Scholar
[3] Bach F, Jenatton R, Mairal J, Obozinski G (2012). Optimization with sparsity-inducing penalties. Foundations Trends® Machine Learn. 4(1):1–106.Google Scholar
[4] Bai S, Li M, Lu C, Zhu D, Deng S (2022) The equivalence of three types of error bounds for weakly and approximately convex functions. J. Optim. Theory Appl. 194(1):220–245.Crossref, Google Scholar
[5] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1):459–494.Crossref, Google Scholar
[6] Bolte J, Sabach S, Teboulle M (2018) Nonconvex Lagrangian-based optimization: Monitoring schemes and global convergence. Math. Oper. Res. 43(4):1210–1232.Link, Google Scholar
[7] Bolte J, Sabach S, Teboulle M, Vaisbourd Y (2018) First-order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3):2131–2151.Crossref, Google Scholar
[8] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations Trends® Machine Learn. 3(1):1–122.Crossref, Google Scholar
[9] Bradley PS, Mangasarian OL, Street WN (1998) Feature selection via mathematical programming. INFORMS J. Comput. 10(2):209–217.Link, Google Scholar
[10] Candes EJ, Wakin MB (2008) An introduction to compressive sampling. IEEE Signal Processing Magazine 25(2):21–30.Crossref, Google Scholar
[11] Candes EJ, Wakin MB, Boyd SP (2008) Enhancing sparsity by reweighted ℓ1 minimization. J. Fourier Anal. Appl. 14(5–6):877–905.Crossref, Google Scholar
[12] Chen G, Teboulle M (1994) A proximal-based decomposition method for convex minimization problems. Math. Programming 64(1–3):81–101.Crossref, Google Scholar
[13] Cohen G (1980) Auxiliary problem principle and decomposition of optimization problems. J. Optim. Theory Appl. 32(3):277–305.Crossref, Google Scholar
[14] Cohen G, Zhu D (1984) Decomposition and coordination methods in large scale optimization problems: The nondifferentiable case and the use of augmented lagrangians. Adv. Large Scale Systems 1:203–266.Google Scholar
[15] Cohen E, Hallak N, Teboulle M (2021) A dynamic alternating direction of multipliers for nonconvex minimization with nonlinear functional equality constraints. J. Optim. Theory Appl. 193(1):324–353.Google Scholar
[16] Deng W, Lai MJ, Peng Z, Yin WT (2017) Parallel multi-block ADMM with o(1/k) convergence. J. Sci. Comput. 71(2):712–736.Crossref, Google Scholar
[17] Gao X, Zhang S (2017) First-order algorithms for convex optimization with nonseparate objective and coupled constraints. J. Oper. Res. Soc. China 5:131–159.Crossref, Google Scholar
[18] Hong M, Luo ZQ, Razaviyayn M (2016) Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26(1):337–364.Crossref, Google Scholar
[19] Huber PJ (1973) Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. Statist. 1(5):799–821.Crossref, Google Scholar
[20] Jiang B, Lin T, Ma S, Zhang S (2019) Structured nonconvex and nonsmooth optimization: Algorithms and iteration complexity analysis. Comput. Optim. Appl. 72:115–157.Crossref, Google Scholar
[21] Li G, Pong TK (2018) Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods. Foundations Comput. Math. 18(5):1199–1232.Crossref, Google Scholar
[22] Liu Q, Shen X, Gu Y (2019) Linearized ADMM for nonconvex nonsmooth optimization with convergence analysis. IEEE Access 7:76131–76144.Crossref, Google Scholar
[23] Ma S, Aybat NS (2018) Efficient optimization algorithms for robust principal component analysis and its variants. Proc. IEEE 106(8):1411–1426.Crossref, Google Scholar
[24] Mei S, Bai Y, Montanari A (2018) The landscape of empirical risk for nonconvex losses. Ann. Statist. 46(6A):2747–2774.Crossref, Google Scholar
[25] Mordukhovich BS (2006) Variational Analysis and Generalized Differentiation I: Basic Theory, vol. 330 (Springer Science & Business Media, Boston).Crossref, Google Scholar
[26] Nguyen T, Sanner S (2013) Algorithms for direct 0-1 loss optimization in binary classification. Proc. Internat. Conf. on Machine Learn. (PMLR), 1085–1093.Google Scholar
[27] Qian S, Liu H, Liu C, Wu S, San Wong H (2018) Adaptive activation functions in convolutional neural networks. Neurocomputing 272:204–212.Crossref, Google Scholar
[28] Qin Z, Fan J, Liu Y, Gao Y, Li GY (2018) Sparse representation for wireless communications: A compressive sensing approach. IEEE Signal Processing Magazine 35(3):40–58.Crossref, Google Scholar
[29] Robinson SM (1981) Some continuity properties of polyhedral multifunctions. Mathematical Programming at Oberwolfach (Springer, Berlin), 206–214.Crossref, Google Scholar
[30] Rockafellar RT, Wets RJB (2009) Variational Analysis, vol. 317 (Springer Science & Business Media, Boston).Google Scholar
[31] Shi Y, Zhang J, Chen W, Letaief KB (2018) Generalized sparse and low-rank optimization for ultra-dense networks. IEEE Comm. Magazine 56(6):42–48.Crossref, Google Scholar
[32] Teboulle M (2018) A simplified view of first order methods for optimization. Math. Programming 170(1):67–96.Crossref, Google Scholar
[33] Tseng P, Yun S (2009) A coordinate gradient descent method for nonsmooth separable minimization. Math. Programming 117(1–2):387–423.Crossref, Google Scholar
[34] Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans. Neural Networks 10(5):988–999.Crossref, Google Scholar
[35] Wainwright MJ (2014) Structured regularizers for high-dimensional problems: Statistical and computational issues. Annu. Rev. Statist. Appl. 1:233–253.Crossref, Google Scholar
[36] Wang Y, Yin W, Zeng J (2019) Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78:29–63.Crossref, Google Scholar
[37] Zhang J, Luo ZQ (2020) A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM J. Optim. 30(3):2272–2302.Crossref, Google Scholar
[38] Zhao L, Zhu DL (2022) On iteration complexity of a first-order primal-dual method for nonlinear convex cone programming. J. Oper. Res. Soc. China 10(1):53–87.Crossref, Google Scholar
[39] Zheng XY, Ng KF (2015) Hölder stable minimizers, tilt stability, and Hölder metric regularity of subdifferentials. SIAM J. Optim. 25(1):416–438.Crossref, Google Scholar

cover image Mathematics of Operations Research

Volume 49, Issue 1

February 2024

Pages 1-651, C2

Article Information

Metrics

Information

Received:August 31, 2021
Accepted:December 08, 2022
Published Online:March 20, 2023

Cite as

Daoli Zhu, Lei Zhao, Shuzhong Zhang (2023) A First-Order Primal-Dual Method for Nonconvex Constrained Optimization Based on the Augmented Lagrangian. Mathematics of Operations Research 49(1):125-150.

https://doi.org/10.1287/moor.2022.1350

Keywords

Acknowledgments

The authors thank the two anonymous referees whose comments significantly helped improve this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

A First-Order Primal-Dual Method for Nonconvex Constrained Optimization Based on the Augmented Lagrangian

References

Volume 49, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News