A New Homotopy Proximal Variable-Metric Framework for Composite Convex Minimization

Published Online:https://doi.org/10.1287/moor.2021.1138

References

  • [1] Banerjee O, El Ghaoui L, d’Aspremont A (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Machine Learn. Res. 9(15):485–516.Google Scholar
  • [2] Bauschke HH, Combettes P (2017) Convex Analysis and Monotone Operators Theory in HilbertSspaces, 2nd ed. (Springer-Verlag, New York).CrossrefGoogle Scholar
  • [3] Beck A, Teboulle M (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Processing 18(11):2419–2434.CrossrefGoogle Scholar
  • [4] Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding agorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.CrossrefGoogle Scholar
  • [5] Beck A, Teboulle M (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.CrossrefGoogle Scholar
  • [6] Becker S, Fadili M (2012) A quasi-Newton proximal splitting method. Adv. Neural Inform. Processing Systems Foundation, vol. 25, 1–8.Google Scholar
  • [7] Becker S, Candès EJ, Grant M (2011) Templates for convex cone problems with applications to sparse signal recovery. Math. Programming. Comput. 3(3):165–218.CrossrefGoogle Scholar
  • [8] Bertsekas D (2011) Incremental gradient, subgradient, and proximal methods for convex optimization: A survey. Sra S, Nowozin S, Wright SJ, eds. Optimization for Machine Learning, 1–38.Google Scholar
  • [9] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations Trends Machine Learn. 3(1):1–122.CrossrefGoogle Scholar
  • [10] Chambolle A, Pock T (2011) A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vision 40(1):120–145.CrossrefGoogle Scholar
  • [11] Dalalyan AS, Hebiri M, Meziani K, Salmon J (2013) Learning heteroscedastic models by convex programming under group sparsity. Proc. Internat. Conf. Machine Learn. (PMLR, Atlanta), 379–387.Google Scholar
  • [12] Davis D (2015) Convergence rate analysis of the forward-Douglas-Rachford splitting scheme. SIAM J. Optim. 25(3):1760–1786.CrossrefGoogle Scholar
  • [13] Defazio A, Bach F, Lacoste-Julien S (2014) SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Adv. Neural Inform. Processing Systems, 1646–1654.Google Scholar
  • [14] Deuflhard P (2006) Newton Methods for Nonlinear Problems—Affine Invariance and Adaptative Algorithms, Springer Series in Computational Mathematics, vol. 35, 2nd ed. (Springer Science & Business Media).Google Scholar
  • [15] Devolder O, Glineur F, Nesterov Y (2014) First-order methods of smooth convex optimization with inexact oracle. Math. Programming 146(1–2):37–75.CrossrefGoogle Scholar
  • [16] Esser JE (2010) Primal-dual algorithm for convex models and applications to image restoration, registration and nonlocal inpainting. Unpublished PhD thesis, University of California, Los Angeles.Google Scholar
  • [17] Fercoq O, Richtárik P (2015) Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.CrossrefGoogle Scholar
  • [18] Friedlander M, Goh G (2017) Efficient evaluation of scaled proximal operators. Electronic Trans. Numerical Anal. 46:1–22.Google Scholar
  • [19] Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.CrossrefGoogle Scholar
  • [20] Giselsson P, Boyd S (2014) Diagonal scaling in Douglas-Rachford splitting and ADMM. 53rd IEEE Conf. Decision Control (IEEE, Los Angeles), 5033–5039.Google Scholar
  • [21] Goldfarb D, Ma S, Scheinberg K (2012) Fast alternating linearization methods of minimization of the sum of two convex functions. Math. Programming Ser. A 141(1):349–382.Google Scholar
  • [22] Goldstein T, Osher S (2009) The split Bregman method for l1-pegularized problems. SIAM J. Imaging Sci. 2(2):323–343.CrossrefGoogle Scholar
  • [23] Grant M, Boyd S, Ye Y (2006) Disciplined convex programming. Liberti L, Maculan N, eds. Global Optimization: From Theory to Implementation, Nonconvex Optimization and its Applications (Springer, Boston), 155–210.CrossrefGoogle Scholar
  • [24] Harchaoui Z, Juditsky A, Nemirovski A (2012) Conditional gradient algorithms for machine learning. NIPS Workshop Optim. Machine Learn. 3(1):3–2.Google Scholar
  • [25] Harman R, Trnovská M (2009) Approximate D-optimal designs of experiments on the convex hull of a finite set of information matrices. Math. Slovaca 59(6):693–704.CrossrefGoogle Scholar
  • [26] Hsieh CJ, Sustik M, Dhillon I, Ravikumar P (2011) Sparse inverse covariance matrix estimation using quadratic approximation. Adv. Neural Inform. Processing Systems, vol. 24, 1–18.Google Scholar
  • [27] Hsieh CJ, Sustik MA, Dhillon IS, Ravikumar PK, Poldrack R (2013) BIG & QUIC: Sparse inverse covariance estimation for a million variables. NIPS, 3165–3173.Google Scholar
  • [28] Ivanoff S, Picard F, Rivoirard V (2016) Adaptive LASSO and group-LASSO for functional Poisson regression. J. Machine Learn. Res. 17(1):1903–1948.Google Scholar
  • [29] Jaggi M (2013) Revisiting Frank-Wolfe: Projection-free sparse convex optimization. J. Machine Learn. Res. 28(1):427–435.Google Scholar
  • [30] Jia J, Xie F, Xu L (2019) Sparse Poisson regression with penalized weighted score function. Electronic J. Statist. 13(2):2898–2920.Google Scholar
  • [31] Karimi S, Vavasis S (2017) IMRO: A proximal quasi-Newton method for solving ℓ1-regularized least squares problems. SIAM J. Optim. 27(2):583–615.CrossrefGoogle Scholar
  • [32] Karimireddy S, Stich S, Jaggi M (2018) Global linear convergence of Newton’s method without strong-convexity or Lipschitz gradients. Preprint, submitted June 1, https://arxiv.org/abs/1806.00413.Google Scholar
  • [33] Kyrillidis A, Karimi R, Tran-Dinh Q, Cevher V (2014) Scalable sparse covariance estimation via self-concordance. Proc. 28th AAAI Conf. Artificial Intelligence, Québec, Canada, 1946–1952.Google Scholar
  • [34] Lam XY, Marron S, Sun D, Toh KC (2018) Fast algorithms for large-scale generalized distance weighted discrimination. J. Comput. Graphical Statist. 27(2):1–12.CrossrefGoogle Scholar
  • [35] Lee J, Sun Y, Saunders M (2014) Proximal Newton-type methods for convex optimization. SIAM J. Optim. 24(3):1420–1443.CrossrefGoogle Scholar
  • [36] Lefkimmiatis S, Unser M (2013) Poisson image reconstruction with Hessian Schatten-norm regularization. IEEE Trans. Image Processing 22(11):4314–4327.CrossrefGoogle Scholar
  • [37] Li L, Toh K (2010) An inexact interior point method for λ -regularized sparse covariance selection. Math. Programming Comput. 2(3):291–315.CrossrefGoogle Scholar
  • [38] Li X, Sun D, Toh KC (2018) A highly efficient semismooth Newton augmented Lagrangian method for solving LASSO problems. SIAM J. Optim. 28(1):433–458.CrossrefGoogle Scholar
  • [39] Lichman M (2017) UCI machine learning repository. https://archive-beta.ics.uci.edu.Google Scholar
  • [40] Lu Z, Pong T (2013) Computing optimal experimental designs via interior point method. SIAM J. Matrix Anal. Appl. 34(4):1556–1580.CrossrefGoogle Scholar
  • [41] Marron SJ, Todd MJ, Ahn J (2007) Distance-weighted discrimination. J. Amer. Statist. Assoc. 102(480):1267–1271.CrossrefGoogle Scholar
  • [42] Nesterov Y (2004) Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87 (Kluwer Academic Publishers).CrossrefGoogle Scholar
  • [43] Nesterov Y (2005) Smooth minimization of non-smooth functions. Math. Programming 103(1):127–152.CrossrefGoogle Scholar
  • [44] Nesterov Y (2008) Accelerating the cubic regularization of Newton’s method on convex problems. Math. Programming 112:159–181.CrossrefGoogle Scholar
  • [45] Nesterov Y (2013) Gradient methods for minimizing composite objective function. Math. Programming 140(1):125–161.CrossrefGoogle Scholar
  • [46] Nesterov Y, Nemirovski A (1994) Interior-point Polynomial Algorithms in Convex Programming (SIAM).CrossrefGoogle Scholar
  • [47] Nesterov Y, Polyak B (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108(1):177–205.CrossrefGoogle Scholar
  • [48] Nesterov Y, Stich SU (2017) Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27(1):110–123.CrossrefGoogle Scholar
  • [49] Nitanda A (2014) Stochastic proximal gradient descent with acceleration techniques. Adv. Neural Inform. Processing Systems, vol. 27, 1574–1582.Google Scholar
  • [50] Ostrovskii DM, Bach F (2021) Finite-sample analysis of M-estimators using self-concordance. Electronic J. Statist. 15(1):326–391.Google Scholar
  • [51] Parikh N, Boyd S (2013) Proximal algorithms. Foundations Trends Optim. 1(3):123–231.Google Scholar
  • [52] Richtárik P, Takáč M (2012) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Programming 144(1–2):1–38.CrossrefGoogle Scholar
  • [53] Rockafellar RT (1970) Convex Analysis, Princeton Mathematics Series, vol. 28 (Princeton University Press).CrossrefGoogle Scholar
  • [54] Schmidt M, Roux N, Bach F (2011) Convergence rates of inexact proximal-gradient methods for convex optimization. Adv. Neural Inform. Processing Systems, vol. 24, 1–9.Google Scholar
  • [55] Shalev-Shwartz S, Zhang T (2013) Stochastic dual coordinate ascent methods for regularized loss minimization. J. Machine Learn. Res. 14:567–599.Google Scholar
  • [56] Shefi R, Teboulle M (2014) Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. Optim. 24(1):269–297.CrossrefGoogle Scholar
  • [57] Stella L, Themelis A, Patrinos P (2017) Forward–backward quasi-Newton methods for nonsmooth optimization problems. Comput. Optim. Appl. 67(3):443–487.CrossrefGoogle Scholar
  • [58] Su W, Boyd S, Candes E. (2014) A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. Adv. Neural Inform. Processing Systems, 2510–2518.Google Scholar
  • [59] Sun T, Tran-Dinh Q (2019) Generalized self-concordant functions: A recipe for Newton-type methods. Math. Programming 178(1–2):145–213.CrossrefGoogle Scholar
  • [60] Toh KC, Todd M, Tütüncü R (2010) On the implementation and usage of SDPT3—A Matlab software package for semidefinite-quadratic-linear programming. Technical Report 4, National University of Singapore.Google Scholar
  • [61] Tran-Dinh Q, Fercoq O, Cevher V (2018) A smooth primal-dual optimization framework for nonsmooth composite convex minimization. SIAM J. Optim. 28(1):96–134.CrossrefGoogle Scholar
  • [62] Tran-Dinh Q, Kyrillidis A, Cevher V (2014) An inexact proximal path-following algorithm for constrained convex minimization. SIAM J. Optim. 24(4):1718–1745.CrossrefGoogle Scholar
  • [63] Tran-Dinh Q, Kyrillidis A, Cevher V (2015) Composite self-concordant minimization. J. Machine Learn. Res. 15:374–416.Google Scholar
  • [64] Tran-Dinh Q, Sun T, Lu S (2019) Self-concordant inclusions: A unified framework for path-following generalized Newton-type algorithms. Math. Programming 177(1–2):173–223.CrossrefGoogle Scholar
  • [65] Wright SJ (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.CrossrefGoogle Scholar
  • [66] Wright NRSJ, Figueiredo M (2009) Sparse reconstruction by separable approximation. IEEE Trans. Signal Processing 57:2479–2493.CrossrefGoogle Scholar
  • [67] Yang L, Sun D, Toh KC (2015) SDPNAL+: A majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Programming Comput. 7(3):331–366.CrossrefGoogle Scholar
  • [68] Zhao XY, Sun D, Toh KC (2010) A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim. 20(4):1737–1765.CrossrefGoogle Scholar
  • [69] Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. B 67(2):301–320.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.