A First-Order Primal-Dual Method for Nonconvex Constrained Optimization Based on the Augmented Lagrangian

Published Online:https://doi.org/10.1287/moor.2022.1350

References

  • [1] Apicella A, Donnarumma F, Isgró F, Prevete R (2021) A survey on modern trainable activation functions. Neural Networks 138:14–32.CrossrefGoogle Scholar
  • [2] Attouch H, Bolte J, Svaiter BF (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Programming 137(1–2):91–129.CrossrefGoogle Scholar
  • [3] Bach F, Jenatton R, Mairal J, Obozinski G (2012). Optimization with sparsity-inducing penalties. Foundations Trends® Machine Learn. 4(1):1–106.Google Scholar
  • [4] Bai S, Li M, Lu C, Zhu D, Deng S (2022) The equivalence of three types of error bounds for weakly and approximately convex functions. J. Optim. Theory Appl. 194(1):220–245.CrossrefGoogle Scholar
  • [5] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1):459–494.CrossrefGoogle Scholar
  • [6] Bolte J, Sabach S, Teboulle M (2018) Nonconvex Lagrangian-based optimization: Monitoring schemes and global convergence. Math. Oper. Res. 43(4):1210–1232.LinkGoogle Scholar
  • [7] Bolte J, Sabach S, Teboulle M, Vaisbourd Y (2018) First-order methods beyond convexity and Lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3):2131–2151.CrossrefGoogle Scholar
  • [8] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations Trends® Machine Learn. 3(1):1–122.CrossrefGoogle Scholar
  • [9] Bradley PS, Mangasarian OL, Street WN (1998) Feature selection via mathematical programming. INFORMS J. Comput. 10(2):209–217.LinkGoogle Scholar
  • [10] Candes EJ, Wakin MB (2008) An introduction to compressive sampling. IEEE Signal Processing Magazine 25(2):21–30.CrossrefGoogle Scholar
  • [11] Candes EJ, Wakin MB, Boyd SP (2008) Enhancing sparsity by reweighted ℓ1 minimization. J. Fourier Anal. Appl. 14(5–6):877–905.CrossrefGoogle Scholar
  • [12] Chen G, Teboulle M (1994) A proximal-based decomposition method for convex minimization problems. Math. Programming 64(1–3):81–101.CrossrefGoogle Scholar
  • [13] Cohen G (1980) Auxiliary problem principle and decomposition of optimization problems. J. Optim. Theory Appl. 32(3):277–305.CrossrefGoogle Scholar
  • [14] Cohen G, Zhu D (1984) Decomposition and coordination methods in large scale optimization problems: The nondifferentiable case and the use of augmented lagrangians. Adv. Large Scale Systems 1:203–266.Google Scholar
  • [15] Cohen E, Hallak N, Teboulle M (2021) A dynamic alternating direction of multipliers for nonconvex minimization with nonlinear functional equality constraints. J. Optim. Theory Appl. 193(1):324–353.Google Scholar
  • [16] Deng W, Lai MJ, Peng Z, Yin WT (2017) Parallel multi-block ADMM with o(1/k) convergence. J. Sci. Comput. 71(2):712–736.CrossrefGoogle Scholar
  • [17] Gao X, Zhang S (2017) First-order algorithms for convex optimization with nonseparate objective and coupled constraints. J. Oper. Res. Soc. China 5:131–159.CrossrefGoogle Scholar
  • [18] Hong M, Luo ZQ, Razaviyayn M (2016) Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26(1):337–364.CrossrefGoogle Scholar
  • [19] Huber PJ (1973) Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. Statist. 1(5):799–821.CrossrefGoogle Scholar
  • [20] Jiang B, Lin T, Ma S, Zhang S (2019) Structured nonconvex and nonsmooth optimization: Algorithms and iteration complexity analysis. Comput. Optim. Appl. 72:115–157.CrossrefGoogle Scholar
  • [21] Li G, Pong TK (2018) Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods. Foundations Comput. Math. 18(5):1199–1232.CrossrefGoogle Scholar
  • [22] Liu Q, Shen X, Gu Y (2019) Linearized ADMM for nonconvex nonsmooth optimization with convergence analysis. IEEE Access 7:76131–76144.CrossrefGoogle Scholar
  • [23] Ma S, Aybat NS (2018) Efficient optimization algorithms for robust principal component analysis and its variants. Proc. IEEE 106(8):1411–1426.CrossrefGoogle Scholar
  • [24] Mei S, Bai Y, Montanari A (2018) The landscape of empirical risk for nonconvex losses. Ann. Statist. 46(6A):2747–2774.CrossrefGoogle Scholar
  • [25] Mordukhovich BS (2006) Variational Analysis and Generalized Differentiation I: Basic Theory, vol. 330 (Springer Science & Business Media, Boston).CrossrefGoogle Scholar
  • [26] Nguyen T, Sanner S (2013) Algorithms for direct 0-1 loss optimization in binary classification. Proc. Internat. Conf. on Machine Learn. (PMLR), 1085–1093.Google Scholar
  • [27] Qian S, Liu H, Liu C, Wu S, San Wong H (2018) Adaptive activation functions in convolutional neural networks. Neurocomputing 272:204–212.CrossrefGoogle Scholar
  • [28] Qin Z, Fan J, Liu Y, Gao Y, Li GY (2018) Sparse representation for wireless communications: A compressive sensing approach. IEEE Signal Processing Magazine 35(3):40–58.CrossrefGoogle Scholar
  • [29] Robinson SM (1981) Some continuity properties of polyhedral multifunctions. Mathematical Programming at Oberwolfach (Springer, Berlin), 206–214.CrossrefGoogle Scholar
  • [30] Rockafellar RT, Wets RJB (2009) Variational Analysis, vol. 317 (Springer Science & Business Media, Boston).Google Scholar
  • [31] Shi Y, Zhang J, Chen W, Letaief KB (2018) Generalized sparse and low-rank optimization for ultra-dense networks. IEEE Comm. Magazine 56(6):42–48.CrossrefGoogle Scholar
  • [32] Teboulle M (2018) A simplified view of first order methods for optimization. Math. Programming 170(1):67–96.CrossrefGoogle Scholar
  • [33] Tseng P, Yun S (2009) A coordinate gradient descent method for nonsmooth separable minimization. Math. Programming 117(1–2):387–423.CrossrefGoogle Scholar
  • [34] Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans. Neural Networks 10(5):988–999.CrossrefGoogle Scholar
  • [35] Wainwright MJ (2014) Structured regularizers for high-dimensional problems: Statistical and computational issues. Annu. Rev. Statist. Appl. 1:233–253.CrossrefGoogle Scholar
  • [36] Wang Y, Yin W, Zeng J (2019) Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78:29–63.CrossrefGoogle Scholar
  • [37] Zhang J, Luo ZQ (2020) A proximal alternating direction method of multiplier for linearly constrained nonconvex minimization. SIAM J. Optim. 30(3):2272–2302.CrossrefGoogle Scholar
  • [38] Zhao L, Zhu DL (2022) On iteration complexity of a first-order primal-dual method for nonlinear convex cone programming. J. Oper. Res. Soc. China 10(1):53–87.CrossrefGoogle Scholar
  • [39] Zheng XY, Ng KF (2015) Hölder stable minimizers, tilt stability, and Hölder metric regularity of subdifferentials. SIAM J. Optim. 25(1):416–438.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.