Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization

Ion Necoara
Corresponding Author
Ion Necoara
[email protected]
https://orcid.org/0000-0003-1102-2654
Automatic Control and Systems Engineering Department, University Politehnica Bucharest, 060042 Bucharest, Romania; and Gheorghe Mihoc-Caius Iacob Institute of Mathematical Statistics and Applied Mathematics of the Romanian Academy, 050711 Bucharest, Romania
Search for more papers by this author
,
Flavia Chorobura
Flavia Chorobura
[email protected]
Automatic Control and Systems Engineering Department, University Politehnica Bucharest, 060042 Bucharest, Romania
Search for more papers by this author

Corresponding Author

Ion Necoara

Automatic Control and Systems Engineering Department, University Politehnica Bucharest, 060042 Bucharest, Romania; and Gheorghe Mihoc-Caius Iacob Institute of Mathematical Statistics and Applied Mathematics of the Romanian Academy, 050711 Bucharest, Romania

Search for more papers by this author

Flavia Chorobura

[email protected]

Automatic Control and Systems Engineering Department, University Politehnica Bucharest, 060042 Bucharest, Romania

Search for more papers by this author

Published Online:16 Apr 2024https://doi.org/10.1287/moor.2023.0044

References

[1] Aberdam A, Beck A (2021) An accelerated coordinate gradient descent algorithm for non-separable composite optimization. J. Optim. Theory Appl. 193:219–246.Crossref, Google Scholar
[2] Beck A, Tetruashvili L (2013) On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4):2037–2060.Crossref, Google Scholar
[3] Bertsekas D (1999) Nonlinear Programming (Athena Scientific, Belmont, MA).Google Scholar
[4] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.Crossref, Google Scholar
[5] Bonettini S (2021) Inexact block coordinate descent methods with application to non-negative matrix factorization. IMA J. Numerical Anal. 31(4):1431–1452.Crossref, Google Scholar
[6] Boucheron S, Lugosi G, Massart P (2012) Concentration Inequalities: A Nonasymptotic Theory of Independence (Clarendon Press, Oxford, UK).Google Scholar
[7] Carmon Y, Duchi JC (2019) Gradient descent efficiently finds the cubic regularized nonconvex Newton step. SIAM J. Optim. 29(3):2146–2178.Crossref, Google Scholar
[8] Cartis C, Roberts L (2023) Scalable subspace methods for derivative-free nonlinear least-squares optimization. Math. Programming 199:461–524.Crossref, Google Scholar
[9] Chung F, Lu L (2002) Connected components in random graphs with given expected degree sequences. Ann. Combin. 6:125–145.Crossref, Google Scholar
[10] Davis TA, Hu Y (2011) The University of Florida sparse matrix collection. ACM Trans. Math. Software 38(1):1–25.Crossref, Google Scholar
[11] Fercoq O, Bianchi P (2019) A coordinate-descent primal-dual algorithm with large step size and possibly nonseparable functions. SIAM J. Optim. 29(1):100–134.Crossref, Google Scholar
[12] Fercoq O, Richtarik P (2015) Accelerated, parallel and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.Crossref, Google Scholar
[13] Friedman J, Hastie T, Hofling H, Tibshirani R (2007) Pathwise coordinate optimization. Ann. Appl. Statist. 1(2):302–332.Crossref, Google Scholar
[14] Gratton S, Royer CW, Vicente LN, Zhang Z (2015) Direct search based on probabilistic descent. SIAM J. Optim. 25(3):1515–1541.Crossref, Google Scholar
[15] Grippo L, Sciandrone M (2000) On the convergence of the block nonlinear Gauss-Seidel method under convex constraints. Oper. Res. Lett. 26(3):127–136.Crossref, Google Scholar
[16] Grishchenko D, Iutzeler F, Malick J (2021) Proximal gradient methods with adaptive subspace sampling. Math. Oper. Res. 46(4):1303–1323.Link, Google Scholar
[17] Hanzely F, Mishchenko K, Richtarik P (2018) SEGA: Variance reduction via gradient sketching. Adv. Neural Inform. Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 2082–2093.Google Scholar
[18] Kane DM, Nelson J (2014) Sparser Johnson-Lindenstrauss transforms. J. ACM 61(1):1–23.Crossref, Google Scholar
[19] Kozak D, Becker S, Doostan A, Tenorio L (2021) A stochastic subspace approach to gradient-free optimization in high dimensions. Comput. Optim. Appl. 79:339–368.Crossref, Google Scholar
[20] Latafat P, Themelis A, Patrinos P (2022) Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems. Math. Programming 193:195–224.Crossref, Google Scholar
[21] Lu Z, Xiao L (2015) On the complexity analysis of randomized block-coordinate descent methods. Math. Programming 152(1–2):615–642.Crossref, Google Scholar
[22] Lu H, Freund RM, Nesterov Yu (2018) Relatively smooth convex optimization by first-order methods and applications. SIAM J. Optim. 28(1):333–354.Crossref, Google Scholar
[23] Mahoney M (2016) Lecture notes on randomized linear algebra. Preprint, submitted August 16, https://arxiv.org/abs/1608.0448.Google Scholar
[24] Maulen-Soto R, Fadili J, Attouch H (2022) An SDE perspective on stochastic convex optimization. Preprint, submitted July 6, https://arxiv.org/abs/2207.02750.Google Scholar
[25] Mitchell T (1997) Machine Learning (McGraw Hill, New York).Google Scholar
[26] Necoara I (2013) Random coordinate descent algorithms for multi-agent convex optimization over networks. IEEE Trans. Automatic Control 58(8):2001–2012.Crossref, Google Scholar
[27] Necoara I, Clipici D (2013) Efficient parallel coordinate descent algorithm for convex optimization problems with separable constraints: Application to distributed MPC. J. Process Control 23(3):243–253.Crossref, Google Scholar
[28] Necoara I, Clipici D (2016) Parallel random coordinate descent methods for composite minimization: Convergence analysis and error bounds. SIAM J. Optim. 26(1):197–226.Crossref, Google Scholar
[29] Necoara I, Patrascu A (2014) A random coordinate descent algorithm for optimization problems with composite objective function and linear coupled constraints. Comput. Optim. Appl. 57:307–337.Crossref, Google Scholar
[30] Necoara I, Takac M (2021) Randomized sketch descent methods for non-separable linearly constrained optimization. IMA J. Numerical Anal. 41(2):1056–1092.Crossref, Google Scholar
[31] Nesterov Y (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.Crossref, Google Scholar
[32] Nesterov Y (2018) Lectures on Convex Optimization, 2nd ed. (Springer, Berlin).Crossref, Google Scholar
[33] Nesterov Y (2022) Inexact basic tensor methods for some classes of convex optimization problems. Optim. Methods Software 37(3):878–906.Crossref, Google Scholar
[34] Nesterov Y, Polyak BT (2006) Cubic regularization of Newton method and its global performance. Math. Programming 108:177–205.Crossref, Google Scholar
[35] Richtarik P, Takac M (2014) Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Programming 144:1–38.Crossref, Google Scholar
[36] Rudin W (1987) Real and Complex Analysis (McGraw-Hill, New York).Google Scholar
[37] Shao Z (2022) On random embeddings and their applications to optimization. Unpublished PhD thesis, University of Oxford, Oxford, UK.Google Scholar
[38] Tseng P, Yun S (2009) Block-coordinate gradient descent method for linearly constrained nonsmooth separable optimization. J. Optim. Theory Appl. 140:513–535.Crossref, Google Scholar
[39] Wang Z (2020) The exact modulus of the generalized Kurdyka-Lojasiewicz property. Unpublished master’s thesis, The University of British Columbia, Vancouver.Google Scholar
[40] Wright SJ (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.Crossref, Google Scholar

cover image Mathematics of Operations Research

Volume 50, Issue 2

May 2025

Pages iii, 783-1583

Article Information

Metrics

Information

Received:February 15, 2023
Accepted:March 25, 2024
Published Online:April 16, 2024

Cite as

Ion Necoara, Flavia Chorobura; (2024) Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization. Mathematics of Operations Research 50(2):993-1018.

https://doi.org/10.1287/moor.2023.0044

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization

References

Volume 50, Issue 2

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News