A New Homotopy Proximal Variable-Metric Framework for Composite Convex Minimization
Published Online:4 Oct 2021https://doi.org/10.1287/moor.2021.1138
References
- [1] (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Machine Learn. Res. 9(15):485–516.Google Scholar
- [2] (2017) Convex Analysis and Monotone Operators Theory in HilbertSspaces, 2nd ed. (Springer-Verlag, New York).Crossref, Google Scholar
- [3] (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Processing 18(11):2419–2434.Crossref, Google Scholar
- [4] (2009) A fast iterative shrinkage-thresholding agorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.Crossref, Google Scholar
- [5] (2012) Smoothing and first order methods: A unified framework. SIAM J. Optim. 22(2):557–580.Crossref, Google Scholar
- [6] (2012) A quasi-Newton proximal splitting method. Adv. Neural Inform. Processing Systems Foundation, vol. 25, 1–8.Google Scholar
- [7] (2011) Templates for convex cone problems with applications to sparse signal recovery. Math. Programming. Comput. 3(3):165–218.Crossref, Google Scholar
- [8] (2011) Incremental gradient, subgradient, and proximal methods for convex optimization: A survey. Sra S, Nowozin S, Wright SJ, eds. Optimization for Machine Learning, 1–38.Google Scholar
- [9] (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations Trends Machine Learn. 3(1):1–122.Crossref, Google Scholar
- [10] (2011) A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vision 40(1):120–145.Crossref, Google Scholar
- [11] (2013) Learning heteroscedastic models by convex programming under group sparsity. Proc. Internat. Conf. Machine Learn. (PMLR, Atlanta), 379–387.Google Scholar
- [12] (2015) Convergence rate analysis of the forward-Douglas-Rachford splitting scheme. SIAM J. Optim. 25(3):1760–1786.Crossref, Google Scholar
- [13] (2014) SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Adv. Neural Inform. Processing Systems, 1646–1654.Google Scholar
- [14] (2006) Newton Methods for Nonlinear Problems—Affine Invariance and Adaptative Algorithms, Springer Series in Computational Mathematics, vol. 35, 2nd ed. (Springer Science & Business Media).Google Scholar
- [15] (2014) First-order methods of smooth convex optimization with inexact oracle. Math. Programming 146(1–2):37–75.Crossref, Google Scholar
- [16] (2010) Primal-dual algorithm for convex models and applications to image restoration, registration and nonlocal inpainting. Unpublished PhD thesis, University of California, Los Angeles.Google Scholar
- [17] (2015) Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.Crossref, Google Scholar
- [18] (2017) Efficient evaluation of scaled proximal operators. Electronic Trans. Numerical Anal. 46:1–22.Google Scholar
- [19] (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.Crossref, Google Scholar
- [20] (2014) Diagonal scaling in Douglas-Rachford splitting and ADMM. 53rd IEEE Conf. Decision Control (IEEE, Los Angeles), 5033–5039.Google Scholar
- [21] (2012) Fast alternating linearization methods of minimization of the sum of two convex functions. Math. Programming Ser. A 141(1):349–382.Google Scholar
- [22] (2009) The split Bregman method for l1-pegularized problems. SIAM J. Imaging Sci. 2(2):323–343.Crossref, Google Scholar
- [23] (2006) Disciplined convex programming. Liberti L, Maculan N, eds. Global Optimization: From Theory to Implementation, Nonconvex Optimization and its Applications (Springer, Boston), 155–210.Crossref, Google Scholar
- [24] (2012) Conditional gradient algorithms for machine learning. NIPS Workshop Optim. Machine Learn. 3(1):3–2.Google Scholar
- [25] (2009) Approximate D-optimal designs of experiments on the convex hull of a finite set of information matrices. Math. Slovaca 59(6):693–704.Crossref, Google Scholar
- [26] (2011) Sparse inverse covariance matrix estimation using quadratic approximation. Adv. Neural Inform. Processing Systems, vol. 24, 1–18.Google Scholar
- [27] (2013) BIG & QUIC: Sparse inverse covariance estimation for a million variables. NIPS, 3165–3173.Google Scholar
- [28] (2016) Adaptive LASSO and group-LASSO for functional Poisson regression. J. Machine Learn. Res. 17(1):1903–1948.Google Scholar
- [29] (2013) Revisiting Frank-Wolfe: Projection-free sparse convex optimization. J. Machine Learn. Res. 28(1):427–435.Google Scholar
- [30] (2019) Sparse Poisson regression with penalized weighted score function. Electronic J. Statist. 13(2):2898–2920.Google Scholar
- [31] (2017) IMRO: A proximal quasi-Newton method for solving ℓ1-regularized least squares problems. SIAM J. Optim. 27(2):583–615.Crossref, Google Scholar
- [32] (2018) Global linear convergence of Newton’s method without strong-convexity or Lipschitz gradients. Preprint, submitted June 1, https://arxiv.org/abs/1806.00413.Google Scholar
- [33] (2014) Scalable sparse covariance estimation via self-concordance. Proc. 28th AAAI Conf. Artificial Intelligence, Québec, Canada, 1946–1952.Google Scholar
- [34] (2018) Fast algorithms for large-scale generalized distance weighted discrimination. J. Comput. Graphical Statist. 27(2):1–12.Crossref, Google Scholar
- [35] (2014) Proximal Newton-type methods for convex optimization. SIAM J. Optim. 24(3):1420–1443.Crossref, Google Scholar
- [36] (2013) Poisson image reconstruction with Hessian Schatten-norm regularization. IEEE Trans. Image Processing 22(11):4314–4327.Crossref, Google Scholar
- [37] (2010) An inexact interior point method for λ -regularized sparse covariance selection. Math. Programming Comput. 2(3):291–315.Crossref, Google Scholar
- [38] (2018) A highly efficient semismooth Newton augmented Lagrangian method for solving LASSO problems. SIAM J. Optim. 28(1):433–458.Crossref, Google Scholar
- [39] (2017) UCI machine learning repository. https://archive-beta.ics.uci.edu.Google Scholar
- [40] (2013) Computing optimal experimental designs via interior point method. SIAM J. Matrix Anal. Appl. 34(4):1556–1580.Crossref, Google Scholar
- [41] (2007) Distance-weighted discrimination. J. Amer. Statist. Assoc. 102(480):1267–1271.Crossref, Google Scholar
- [42] (2004) Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87 (Kluwer Academic Publishers).Crossref, Google Scholar
- [43] (2005) Smooth minimization of non-smooth functions. Math. Programming 103(1):127–152.Crossref, Google Scholar
- [44] (2008) Accelerating the cubic regularization of Newton’s method on convex problems. Math. Programming 112:159–181.Crossref, Google Scholar
- [45] (2013) Gradient methods for minimizing composite objective function. Math. Programming 140(1):125–161.Crossref, Google Scholar
- [46] (1994) Interior-point Polynomial Algorithms in Convex Programming (SIAM).Crossref, Google Scholar
- [47] (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108(1):177–205.Crossref, Google Scholar
- [48] (2017) Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27(1):110–123.Crossref, Google Scholar
- [49] (2014) Stochastic proximal gradient descent with acceleration techniques. Adv. Neural Inform. Processing Systems, vol. 27, 1574–1582.Google Scholar
- [50] (2021) Finite-sample analysis of M-estimators using self-concordance. Electronic J. Statist. 15(1):326–391.Google Scholar
- [51] (2013) Proximal algorithms. Foundations Trends Optim. 1(3):123–231.Google Scholar
- [52] (2012) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Programming 144(1–2):1–38.Crossref, Google Scholar
- [53] (1970) Convex Analysis, Princeton Mathematics Series, vol. 28 (Princeton University Press).Crossref, Google Scholar
- [54] (2011) Convergence rates of inexact proximal-gradient methods for convex optimization. Adv. Neural Inform. Processing Systems, vol. 24, 1–9.Google Scholar
- [55] (2013) Stochastic dual coordinate ascent methods for regularized loss minimization. J. Machine Learn. Res. 14:567–599.Google Scholar
- [56] (2014) Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization. SIAM J. Optim. 24(1):269–297.Crossref, Google Scholar
- [57] (2017) Forward–backward quasi-Newton methods for nonsmooth optimization problems. Comput. Optim. Appl. 67(3):443–487.Crossref, Google Scholar
- [58] (2014) A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights. Adv. Neural Inform. Processing Systems, 2510–2518.Google Scholar
- [59] (2019) Generalized self-concordant functions: A recipe for Newton-type methods. Math. Programming 178(1–2):145–213.Crossref, Google Scholar
- [60] (2010) On the implementation and usage of SDPT3—A Matlab software package for semidefinite-quadratic-linear programming. Technical Report 4, National University of Singapore.Google Scholar
- [61] (2018) A smooth primal-dual optimization framework for nonsmooth composite convex minimization. SIAM J. Optim. 28(1):96–134.Crossref, Google Scholar
- [62] (2014) An inexact proximal path-following algorithm for constrained convex minimization. SIAM J. Optim. 24(4):1718–1745.Crossref, Google Scholar
- [63] (2015) Composite self-concordant minimization. J. Machine Learn. Res. 15:374–416.Google Scholar
- [64] (2019) Self-concordant inclusions: A unified framework for path-following generalized Newton-type algorithms. Math. Programming 177(1–2):173–223.Crossref, Google Scholar
- [65] (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.Crossref, Google Scholar
- [66] (2009) Sparse reconstruction by separable approximation. IEEE Trans. Signal Processing 57:2479–2493.Crossref, Google Scholar
- [67] (2015) SDPNAL+: A majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Programming Comput. 7(3):331–366.Crossref, Google Scholar
- [68] (2010) A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim. 20(4):1737–1765.Crossref, Google Scholar
- [69] (2005) Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. B 67(2):301–320.Crossref, Google Scholar

