Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization
Published Online:16 Apr 2024https://doi.org/10.1287/moor.2023.0044
References
- [1] (2021) An accelerated coordinate gradient descent algorithm for non-separable composite optimization. J. Optim. Theory Appl. 193:219–246.Crossref, Google Scholar
- [2] (2013) On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4):2037–2060.Crossref, Google Scholar
- [3] (1999) Nonlinear Programming (Athena Scientific, Belmont, MA).Google Scholar
- [4] (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.Crossref, Google Scholar
- [5] (2021) Inexact block coordinate descent methods with application to non-negative matrix factorization. IMA J. Numerical Anal. 31(4):1431–1452.Crossref, Google Scholar
- [6] (2012) Concentration Inequalities: A Nonasymptotic Theory of Independence (Clarendon Press, Oxford, UK).Google Scholar
- [7] (2019) Gradient descent efficiently finds the cubic regularized nonconvex Newton step. SIAM J. Optim. 29(3):2146–2178.Crossref, Google Scholar
- [8] (2023) Scalable subspace methods for derivative-free nonlinear least-squares optimization. Math. Programming 199:461–524.Crossref, Google Scholar
- [9] (2002) Connected components in random graphs with given expected degree sequences. Ann. Combin. 6:125–145.Crossref, Google Scholar
- [10] (2011) The University of Florida sparse matrix collection. ACM Trans. Math. Software 38(1):1–25.Crossref, Google Scholar
- [11] (2019) A coordinate-descent primal-dual algorithm with large step size and possibly nonseparable functions. SIAM J. Optim. 29(1):100–134.Crossref, Google Scholar
- [12] (2015) Accelerated, parallel and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.Crossref, Google Scholar
- [13] (2007) Pathwise coordinate optimization. Ann. Appl. Statist. 1(2):302–332.Crossref, Google Scholar
- [14] (2015) Direct search based on probabilistic descent. SIAM J. Optim. 25(3):1515–1541.Crossref, Google Scholar
- [15] (2000) On the convergence of the block nonlinear Gauss-Seidel method under convex constraints. Oper. Res. Lett. 26(3):127–136.Crossref, Google Scholar
- [16] (2021) Proximal gradient methods with adaptive subspace sampling. Math. Oper. Res. 46(4):1303–1323.Link, Google Scholar
- [17] (2018) SEGA: Variance reduction via gradient sketching. Adv. Neural Inform. Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 2082–2093.Google Scholar
- [18] (2014) Sparser Johnson-Lindenstrauss transforms. J. ACM 61(1):1–23.Crossref, Google Scholar
- [19] (2021) A stochastic subspace approach to gradient-free optimization in high dimensions. Comput. Optim. Appl. 79:339–368.Crossref, Google Scholar
- [20] (2022) Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems. Math. Programming 193:195–224.Crossref, Google Scholar
- [21] (2015) On the complexity analysis of randomized block-coordinate descent methods. Math. Programming 152(1–2):615–642.Crossref, Google Scholar
- [22] (2018) Relatively smooth convex optimization by first-order methods and applications. SIAM J. Optim. 28(1):333–354.Crossref, Google Scholar
- [23] (2016) Lecture notes on randomized linear algebra. Preprint, submitted August 16, https://arxiv.org/abs/1608.0448.Google Scholar
- [24] (2022) An SDE perspective on stochastic convex optimization. Preprint, submitted July 6, https://arxiv.org/abs/2207.02750.Google Scholar
- [25] (1997) Machine Learning (McGraw Hill, New York).Google Scholar
- [26] (2013) Random coordinate descent algorithms for multi-agent convex optimization over networks. IEEE Trans. Automatic Control 58(8):2001–2012.Crossref, Google Scholar
- [27] (2013) Efficient parallel coordinate descent algorithm for convex optimization problems with separable constraints: Application to distributed MPC. J. Process Control 23(3):243–253.Crossref, Google Scholar
- [28] (2016) Parallel random coordinate descent methods for composite minimization: Convergence analysis and error bounds. SIAM J. Optim. 26(1):197–226.Crossref, Google Scholar
- [29] (2014) A random coordinate descent algorithm for optimization problems with composite objective function and linear coupled constraints. Comput. Optim. Appl. 57:307–337.Crossref, Google Scholar
- [30] (2021) Randomized sketch descent methods for non-separable linearly constrained optimization. IMA J. Numerical Anal. 41(2):1056–1092.Crossref, Google Scholar
- [31] (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.Crossref, Google Scholar
- [32] (2018) Lectures on Convex Optimization, 2nd ed. (Springer, Berlin).Crossref, Google Scholar
- [33] (2022) Inexact basic tensor methods for some classes of convex optimization problems. Optim. Methods Software 37(3):878–906.Crossref, Google Scholar
- [34] (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108:177–205.Crossref, Google Scholar
- [35] (2014) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Programming 144:1–38.Crossref, Google Scholar
- [36] (1987) Real and Complex Analysis (McGraw-Hill, New York).Google Scholar
- [37] (2022) On random embeddings and their applications to optimization. Unpublished PhD thesis, University of Oxford, Oxford, UK.Google Scholar
- [38] (2009) Block-coordinate gradient descent method for linearly constrained nonsmooth separable optimization. J. Optim. Theory Appl. 140:513–535.Crossref, Google Scholar
- [39] (2020) The exact modulus of the generalized Kurdyka-Lojasiewicz property. Unpublished master’s thesis, The University of British Columbia, Vancouver.Google Scholar
- [40] (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.Crossref, Google Scholar

