Zero-One Composite Optimization: Lyapunov Exact Penalty and a Globally Convergent Inexact Augmented Lagrangian Method

Published Online:https://doi.org/10.1287/moor.2021.0320

References

  • [1] Beck A (2017) First-Order Methods in Optimization, MOS-SIAM Series on Optimization (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [2] Bertsekas DP (1996) Constrained Optimization and Lagrange Multiplier Methods, Athena Scientific Optimization and Computation Series (Athena Scientific, Nashua, NH).Google Scholar
  • [3] Birgin EG, Martínez JM (2014) Practical Augmented Lagrangian Methods for Constrained Optimization (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [4] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1):459–494.CrossrefGoogle Scholar
  • [5] Bolte J, Sabach S, Teboulle M (2018) Nonconvex Lagrangian-based optimization: Monitoring schemes and global convergence. Math. Oper. Res. 43(4):1210–1232.LinkGoogle Scholar
  • [6] Boţ RI, Nguyen DK (2020) The proximal alternating direction method of multipliers in the nonconvex setting: Convergence analysis and rates. Math. Oper. Res. 45(2):682–712.LinkGoogle Scholar
  • [7] Boţ RI, Csetnek ER, Nguyen DK (2019) A proximal minimization algorithm for structured nonconvex and nonsmooth problems. SIAM J. Optim. 29(2):1300–1328.CrossrefGoogle Scholar
  • [8] Boufounos PT, Baraniuk RG (2008) 1-bit compressive sensing. Chiang M, Erkip E, Bennatan A, Kutyniok G, eds. 2008 42nd Annual Conf. Inform. Sci. Systems (IEEE, Piscataway, NJ), 16–21.Google Scholar
  • [9] Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recognition 37(9):1757–1771.CrossrefGoogle Scholar
  • [10] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers (now Foundations and Trends, Boston).Google Scholar
  • [11] Chambolle A, Pock T (2011) A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vision 40(1):120–145.CrossrefGoogle Scholar
  • [12] Chen X, Guo L, Lu Z, Ye JJ (2017) An augmented Lagrangian method for non-Lipschitz nonconvex programming. SIAM J. Numerical Anal. 55(1):168–193.CrossrefGoogle Scholar
  • [13] Chen C, Liu YJ, Sun D, Toh KC (2016) A semismooth Newton-CG based dual PPA for matrix spectral norm approximation problems. Math. Programming 155(1–2):435–470.CrossrefGoogle Scholar
  • [14] Cordova M, de Oliveira W, Sagastizábal C (2021) Revisiting augmented Lagrangian duals. Math. Programming 196(1):235–277.CrossrefGoogle Scholar
  • [15] Cortes C, Vapnik V (1995) Support-vector networks. Machine Learn. 20(3):273–297.CrossrefGoogle Scholar
  • [16] De Marchi A, Jia X, Kanzow C, Mehlitz P (2023) Constrained composite optimization and augmented Lagrangian methods. Math. Programming 201(2):863–896.CrossrefGoogle Scholar
  • [17] Fan Y, Han F, Li W, Zhou XH (2020) On rank estimators in increasing dimensions. J. Econometrics 214(2):379–412.CrossrefGoogle Scholar
  • [18] Friedrich J, Zhou P, Paninski L (2017) Fast online deconvolution of calcium imaging data. PLoS Comput. Biology 13(3):e1005423.CrossrefGoogle Scholar
  • [19] Gao W, Goldfarb D, Curtis FE (2020) ADMM for multiaffine constrained optimization. Optim. Methods Software 35(2):257–303.CrossrefGoogle Scholar
  • [20] Han AK (1987) Non-parametric analysis of a generalized regression model: The maximum rank correlation estimator. J. Econometrics 35(2–3):303–316.CrossrefGoogle Scholar
  • [21] Han D, Sun D, Zhang L (2018) Linear rate convergence of the alternating direction method of multipliers for convex composite programming. Math. Oper. Res. 43(2):622–637.LinkGoogle Scholar
  • [22] Hong M, Luo ZQ, Razaviyayn M (2016) Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26(1):337–364.CrossrefGoogle Scholar
  • [23] Huang J, Jiao Y, Lu X, Zhu L (2018) Robust decoding from 1-bit compressive sampling with ordinary and regularized least squares. SIAM J. Sci. Comput. 40(4):A2062–A2086.CrossrefGoogle Scholar
  • [24] Jayadeva, Khemchandani R, Chandra S (2007) Twin support vector machines for pattern classification. IEEE Trans. Pattern Anal. Machine Intelligence 29(5):905–910.CrossrefGoogle Scholar
  • [25] Jia Z, Huang J, Wu Z (2021) An incremental aggregated proximal ADMM for linearly constrained nonconvex optimization with application to sparse logistic regression problems. J. Comput. Appl. Math. 390:113384.CrossrefGoogle Scholar
  • [26] Jia X, Kanzow C, Mehlitz P, Wachsmuth G (2023) An augmented Lagrangian method for optimization problems with structured geometric constraints. Math. Programming 199(2):1365–1415.CrossrefGoogle Scholar
  • [27] Kanzow C, Qi HD (1999) A QP-free constrained Newton-type method for variational inequality problems. Math. Programming 85(1):81–106.CrossrefGoogle Scholar
  • [28] Kanzow C, Raharja AB, Schwartz A (2021) An augmented Lagrangian method for cardinality-constrained optimization problems. J. Optim. Theory Appl. 83(3):793–813.CrossrefGoogle Scholar
  • [29] Lai MJ, Xu Y, Yin W (2013) Improved iteratively reweighted least squares for unconstrained smoothed lq minimization. SIAM J. Numerical Anal. 51(2):927–957.CrossrefGoogle Scholar
  • [30] Lee YJ, Mangasarian OL (2001) SSVM: A smooth support vector machine for classification. Comput. Optim. Appl. 20(1):5–22.CrossrefGoogle Scholar
  • [31] Li G, Pong TK (2015) Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4):2434–2460.CrossrefGoogle Scholar
  • [32] Li X, Sun D, Toh KC (2018) QSDPNAL: A two-phase augmented Lagrangian method for convex quadratic semidefinite programming. Math. Programming Comput. 10(4):703–743.CrossrefGoogle Scholar
  • [33] Li Z, Wang Q, Li R (2021) Central limit theorem for linear spectral statistics of large dimensional Kendall’s rank correlation matrices and its applications. Ann. Statist. 49(3):1569–1593.CrossrefGoogle Scholar
  • [34] Mangasarian OL, Musicant DR (1999) Successive overrelaxation for support vector machines. IEEE Trans. Neural Networks Learn. Systems 10(5):1032–1037.CrossrefGoogle Scholar
  • [35] Mordukhovich BS, Nam NM (2013) An Easy Path to Convex Analysis and Applications, Synthesis Lectures on Mathematics and Statistics (Morgan & Claypool Publishers, Kentfield, CA).Google Scholar
  • [36] Nocedal J, Wright S (2006) Numerical Optimization, Springer Series in Operations Research and Financial Engineering (Springer, New York).Google Scholar
  • [37] Noschese S, Pasquini L, Reichel L (2013) Tridiagonal Toeplitz matrices: Properties and novel applications. Numerical Linear Algebra Appl. 20(2):302–326.CrossrefGoogle Scholar
  • [38] Ochs P, Fadili J, Brox T (2019) Non-smooth non-convex Bregman minimization: Unification and new algorithms. J. Optim. Theory Appl. 181(1):244–278.CrossrefGoogle Scholar
  • [39] Rockafellar RT (1976) Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2):97–116.LinkGoogle Scholar
  • [40] Rockafellar RT (2023) Convergence of augmented Lagrangian methods in extensions beyond nonlinear programming. Math. Programming 199(1–2):375–420.CrossrefGoogle Scholar
  • [41] Rockafellar RT, Wets RJB (1998) Variational Analysis, Fundamental Principles of Mathematical Sciences (Springer, Berlin).CrossrefGoogle Scholar
  • [42] Shao YH, Zhang CH, Wang XB, Deng NY (2011) Improvements on twin support vector machines. IEEE Trans. Neural Networks Learn. Systems 22(6):962–968.CrossrefGoogle Scholar
  • [43] Shefi R, Teboulle M (2014) Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. Optim. 24(1):269–297.CrossrefGoogle Scholar
  • [44] Song J, Li J, Yao Z, Ma K, Bao C (2020) Zero norm based analysis model for image smoothing and reconstruction. Inverse Problems 36(11):115009.CrossrefGoogle Scholar
  • [45] Teboulle M, Vaisbourd Y (2020) Novel proximal gradient methods for nonnegative matrix factorization with sparsity constraints. SIAM J. Imaging Sci. 13(1):381–421.CrossrefGoogle Scholar
  • [46] Teng Y, Yang L, Song X, Yu B (2020) An augmented Lagrangian proximal alternating method for sparse discrete optimization problems. Numerical Algorithms 83(3):833–866.CrossrefGoogle Scholar
  • [47] Themelis A, Stella L, Patrinos P (2018) Forward-backward envelope for the sum of two nonconvex functions: Further properties and nonmonotone linesearch algorithms. SIAM J. Optim. 28(3):2274–2303.CrossrefGoogle Scholar
  • [48] Vogelstein JT, Packer AM, Machado TA, Sippy T, Babadi B, Yuste R, Paninski L (2010) Fast nonnegative deconvolution for spike train inference from population calcium imaging. J. Neurophysiology 104(6):3691–3704.CrossrefGoogle Scholar
  • [49] Wang F, Cao W, Xu Z (2018) Convergence of multi-block Bregman ADMM for nonconvex composite problems. Sci. China Inform. Sci. 61(12):1–12.CrossrefGoogle Scholar
  • [50] Wang Y, Yin W, Zeng J (2019) Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1):29–63.CrossrefGoogle Scholar
  • [51] Wang H, Shao Y, Zhou S, Zhang C, Xiu N (2021) Support vector machine classifier via L0/1 soft-margin loss. IEEE Trans. Pattern Anal. Machine Intelligence 44(10):7253–7265.CrossrefGoogle Scholar
  • [52] Yang L, Sun D, Toh KC (2015) SDPNAL+: A majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Programming Comput. 7(3):331–366.CrossrefGoogle Scholar
  • [53] Yashtini M (2021) Convergence analysis of a proximal linearized ADMM algorithm for nonconvex nonsmooth optimization. Preprint, submitted July 3, https://arxiv.org/abs/2009.05361.Google Scholar
  • [54] Zhang ML, Zhou ZH (2013) A review on multi-label learning algorithms. IEEE Trans. Knowledge Data Engrg. 26(8):1819–1837.CrossrefGoogle Scholar
  • [55] Zhang X, Burger M, Osher S (2011) A unified primal-dual algorithm framework based on Bregman iteration. J. Sci. Comput. 46(1):20–46.CrossrefGoogle Scholar
  • [56] Zhou S, Pan L, Xiu N, Qi HD (2021) Quadratic convergence of smoothing Newton’s method for 0/1 loss optimization. SIAM J. Optim. 31(4):3184–3211.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.