Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems

Quoc Tran-Dinh
Corresponding Author
Quoc Tran-Dinh
[email protected]
https://orcid.org/0000-0002-1077-2579
Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
Search for more papers by this author
,
Yang Luo
Yang Luo
[email protected]
Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
Search for more papers by this author

Corresponding Author

Quoc Tran-Dinh

Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599

Search for more papers by this author

Yang Luo

[email protected]

Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599

Search for more papers by this author

Published Online:2 Apr 2025https://doi.org/10.1287/moor.2024.0414

References

[1] Abhishek VA, Binny S, Johan TR, Nithin R, Vishal T (2022) Federated learning: Collaborative machine learning without centralized training data. Internat. J. Engrg. Technol. Management Sci. 6(5):355–359.Google Scholar
[2] Agarwal RP, Meehan M, O’Regan D (2001) Fixed Point Theory and Applications, vol. 141 (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
[3] Alacaoglu A, Tran-Dinh Q, Fercoq O, Cevher V (2017) Smooth primal-dual coordinate descent algorithms for nonsmooth convex optimization. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., Red Hook, NY), 1–9.Google Scholar
[4] Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR.org), 214–223.Google Scholar
[5] Attouch H, Cabot A (2020) Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Programmng 184(1):243–287.Crossref, Google Scholar
[6] Attouch H, Peypouquet J, Redont P (2018) Backward–forward algorithms for structured monotone inclusions in Hilbert spaces. J. Math. Anal. Appl. 457(2):1095–1117.Crossref, Google Scholar
[7] Auslender A (1971) Méthodes numériques pour la décomposition et la minimisation de fonctions non différentiables. Numerische Mathematik 18(3):213–223.Crossref, Google Scholar
[8] Bauschke HH, Combettes P (2017) Convex Analysis and Monotone Operators Theory in Hilbert Spaces, 2nd ed. (Springer-Verlag, New York).Crossref, Google Scholar
[9] Beck A, Pauwels E, Sabach S (2015) The cyclic block conditional gradient method for convex optimization problems. SIAM J. Optim. 25(4):2024–2049.Crossref, Google Scholar
[10] Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust Optimization (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[11] Ben-Tal A, Margalit T, Nemirovski A (2001) The ordered subsets mirror descent optimization method with applications to tomography. SIAM J. Optim. 12(1):79–108.Crossref, Google Scholar
[12] Bertsekas D, Tsitsiklis JN (1989) Parallel and Distributed Computation: Numerical Methods (Prentice Hall, Englewood Cliffs, NJ).Google Scholar
[13] Böhm A (2023) Solving nonconvex-nonconcave min-max problems exhibiting weak Minty solutions. Trans. Machine Learn. Res., vol. 39 (Journal of Machine Learning Research Inc., New York).Google Scholar
[14] Bot RI, Sedlmayer M, Vuong PT (2023) A relaxed inertial forward-backward-forward algorithm for solving monotone inclusions with application to GANs. J. Machine Learn. Res. 24(8):1–37.Google Scholar
[15] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.Crossref, Google Scholar
[16] Burachik RS, Iusem A (2008) Set-Valued Mappings and Enlargements of Monotone Operators (Springer, New York).Google Scholar
[17] Chang CC, Lin CJ (2011) LIBSVM: A library for support vector machines. ACM Trans. Intelligent Systems Tech. 2(3): 1–27.Crossref, Google Scholar
[18] Chen Y, Lan G, Ouyang Y (2017) Accelerated schemes for a class of variational inequalities. Math. Programming 165(1):113–149.Crossref, Google Scholar
[19] Combettes PL, Eckstein J (2018) Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions. Math. Programming 168(1):645–672.Crossref, Google Scholar
[20] Combettes PL, Pesquet JC (2011) Proximal splitting methods in signal processing. Fixed-Point Algorithms for Inverse Problems in Science and Engineering (Springer-Verlag, Berlin), 185–212.Crossref, Google Scholar
[21] Combettes PL, Pesquet JC (2015) Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim. 25(2):1221–1248.Crossref, Google Scholar
[22] Combettes PL, Wajs VR (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model. Simulation 4(4):1168–1200.Crossref, Google Scholar
[23] Daskalakis C, Ilyas A, Syrgkanis V, Zeng H (2018) Training GANs with optimism. 6th Internat. Conf. Learn. Representations (ICLR 2018) (Vancouver).Google Scholar
[24] Davis D (2022) Variance reduction for root-finding problems. Math. Programming 197:375–410.Crossref, Google Scholar
[25] Davis D, Yin W (2017) A three-operator splitting scheme and its optimization applications. Set-Valued Variational Anal. 25(4):829–858.Crossref, Google Scholar
[26] Defazio A, Bach F, Lacoste-Julien S (2014) SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 1646–1654.Google Scholar
[27] Diakonikolas J (2020) Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities. Conf. Learn. Theory (PMLR, New York), 1428–1451.Google Scholar
[28] Diakonikolas J, Daskalakis C, Jordan M (2021) Efficient methods for structured nonconvex-nonconcave min-max optimization. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 2746–2754.Google Scholar
[29] Drusvyatskiy D, Lewis A (2018) Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3):919–948.Link, Google Scholar
[30] Du W, Xu D, Wu X, Tong H (2021) Fairness-aware agnostic federated learning. Proc. 2021 SIAM Internat. Conf. Data Mining (SIAM, Philadelphia), 181–189.Google Scholar
[31] Facchinei F, Pang JS (2003) Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. 1–2 (Springer-Verlag, Berlin).Google Scholar
[32] Fercoq O, Richtárik P (2015) Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.Crossref, Google Scholar
[33] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 2672–2680.Google Scholar
[34] Halpern B (1967) Fixed points of nonexpanding maps. Bull. Amer. Math. Soc. 73(6):957–961.Crossref, Google Scholar
[35] Ho E, Rajagopalan A, Skvortsov A, Arulampalam S, Piraveenan M (2022) Game theory in defence applications: A review. Sensors 22(3):1032.Crossref, Google Scholar
[36] Hsieh CJ, Sustik MA, Dhillon IS, Ravikumar PK, Poldrack R (2013) BIG & QUIC: Sparse inverse covariance estimation for a million variables. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Inc., Red Hook, NY), 3165–3173.Google Scholar
[37] Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, et al. (2021) Advances and open problems in federated learning. Found Trends Machine Learn. 14(1–2):1–210.Google Scholar
[38] Kim D (2021) Accelerated proximal point method for maximally monotone operators. Math. Programming 190:57–87.Crossref, Google Scholar
[39] Kim D, Fessler JA (2016) Optimized first-order methods for smooth convex minimization. Math. Programming 159(1–2):81–107.Crossref, Google Scholar
[40] Konečnỳ J, McMahan HB, Ramage D, Richtárik P (2016) Federated optimization: Distributed machine learning for on-device intelligence. Preprint, submitted October 8, https://arxiv.org/abs/1610.02527.Google Scholar
[41] Konnov I (2001) Combined Relaxation Methods for Variational Inequalities (Springer-Verlag, Berlin).Crossref, Google Scholar
[42] Korpelevich G (1976) The extragradient method for finding saddle points and other problems. Ekonomika i Matematicheskie Metody 12(4):747–756.Google Scholar
[43] Kotsalis G, Lan G, Li T (2022) Simple and optimal methods for stochastic variational inequalities, I: Operator extrapolation. SIAM J. Optim. 32(3):2041–2073.Crossref, Google Scholar
[44] Kuhn HW, Harsanyi J, Selten R, Weibul J, van Damme E (1996) The work of John Nash in game theory. J. Econom. Theory 69(1):153–185.Crossref, Google Scholar
[45] Lan G (2020) First-Order and Stochastic Optimization Methods for Machine Learning (Springer, Cham, Switzerland).Crossref, Google Scholar
[46] Latafat P, Themelis A, Patrinos P (2019) Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems. Math. Programming 193:195–224.Crossref, Google Scholar
[47] Lee S, Kim D (2021) Fast extra gradient methods for smooth structured nonconvex-nonconcave minimax problems. Advances in Neural Information Processing Systems, vol. 34 (Curran Associates Inc., Red Hook, NY), 16924–16936.Google Scholar
[48] Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine 37(3):50–60.Crossref, Google Scholar
[49] Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020) Federated optimization in heterogeneous networks. Dhillon I, Papailiopoulos D, Sze V, eds. Proceedings of Machine Learning and Systems, vol. 2, 429–450.Google Scholar
[50] Lieder F (2021) On the convergence rate of the Halpern-iteration. Optim. Lett. 15(2):405–418.Crossref, Google Scholar
[51] Lions PL, Mercier B (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numerical Anal. 16(6):964–979.Crossref, Google Scholar
[52] Luo ZQ, Tseng P (1993) Error bounds and convergence analysis of feasible descent methods: A general approach. Ann. Oper. Res. 46(1):157–178.Crossref, Google Scholar
[53] Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A (2018) Towards deep learning models resistant to adversarial attacks. Proc. Internat. Conf. Learn. Representations (ICLR) (Vancouver).Google Scholar
[54] Maingé PE (2021) Accelerated proximal algorithms with a correction term for monotone inclusions. Appl. Math. Optim. 84(2):2027–2061.Crossref, Google Scholar
[55] Maingé PE (2022) Fast convergence of generalized forward-backward algorithms for structured monotone inclusions. J. Convex Anal. 29:893–920.Google Scholar
[56] Malitsky Y (2015) Projected reflected gradient methods for monotone variational inequalities. SIAM J. Optim. 25(1):502–520.Crossref, Google Scholar
[57] Malitsky Y, Tam MK (2020) A forward-backward splitting method for monotone inclusions without cocoercivity. SIAM J. Optim. 30(2):1451–1472.Crossref, Google Scholar
[58] Martinez N, Bertran M, Sapiro G (2020) Minimax pareto fairness: A multi objective perspective. Internat. Conf. Machine Learn. (PMLR, New York), 6755–6764.Google Scholar
[59] McMahan HB, Moore E, Ramage D, Hampson S, Aguera y Arcas B (2017) Communication-efficient learning of deep networks from decentralized data. Artificial Intelligence Statist. (PMLR, New York), 1273–1282.Google Scholar
[60] Mokhtari A, Ozdaglar A, Pattathil S (2020) A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 1497–1507.Google Scholar
[61] Namkoong H, Duchi JC (2016) Stochastic gradient methods for distributionally robust optimization with f-divergences. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates Inc., Red Hook, NY).Google Scholar
[62] Nedíc A, Ozdaglar A (2009) Distributed subgradient methods for multi-agent optimization. IEEE Trans. Automatic Control 54(1):48–61.Crossref, Google Scholar
[63] Nemirovskii A (2004) Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1):229–251.Crossref, Google Scholar
[64] Nesterov Y (1983) A method for unconstrained convex minimization problem with the rate of convergence O(1/k2). Soviet Math. Doklady 269:543–547.Google Scholar
[65] Nesterov Y (2004) Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87 (Kluwer Academic Publishers, Boston).Crossref, Google Scholar
[66] Nesterov Y (2007) Dual extrapolation and its applications to solving variational inequalities and related problems. Math. Programming 109(2–3):319–344.Crossref, Google Scholar
[67] Nesterov Y (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.Crossref, Google Scholar
[68] Nesterov Y, Stich SU (2017) Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27(1):110–123.Crossref, Google Scholar
[69] Ortega JM, Rheinboldt WC (2000) Iterative Solution of Nonlinear Equations in Several Variables (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
[70] Ouyang Y, Xu Y (2021) Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Math. Programming 185(1–2):1–35.Crossref, Google Scholar
[71] Peng Z, Xu Y, Yan M, Yin W (2016) ARock: An algorithmic framework for asynchronous parallel coordinate updates. SIAM J. Sci. Comput. 38(5):2851–2879.Crossref, Google Scholar
[72] Pethick T, Patrinos P, Fercoq O, Cevher V (2022) Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems. Proc. 10th Internat. Conf. Learn. Representations (ICLR 2022) (Virtual Conference: International Conference on Learning Representations).Google Scholar
[73] Peypouquet J, Sorin S (2010) Evolution equations for maximal monotone operators: Asymptotic analysis in continuous and discrete time. J. Convex Anal. (Oxford) 17(3 & 4):1113–1163.Google Scholar
[74] Phelps RR (2009) Convex Functions, Monotone Operators and Differentiability, vol. 1364 (Springer, New York).Google Scholar
[75] Popov LD (1980) A modification of the Arrow-Hurwicz method for search of saddle points. Math. Notes Acad. of Sci. USSR 28(5):845–848.Google Scholar
[76] Rahimian H, Mehrotra S (2019) Distributionally robust optimization: A review. Preprint, submitted August 13, https://arxiv.org/abs/1908.05659.Google Scholar
[77] Richtárik P, Takáč M (2016) Parallel coordinate descent methods for big data optimization. Math. Programming 156(1–2):433–484.Crossref, Google Scholar
[78] Robbins H, Siegmund D (1971) A convergence theorem for non negative almost supermartingales and some applications. Rustagi JS, ed. Optimizing Methods in Statistics (Academic Press, New York), 233–257.Google Scholar
[79] Rockafellar R (1976) Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5):877–898.Crossref, Google Scholar
[80] Rockafellar RT, Wets RJ-B, eds. (2004) Variational Analysis, vol. 317 (Springer-Verlag, Berlin).Google Scholar
[81] Ryu EK, Boyd S (2016) Primer on monotone operator methods. Appl. Comput. Math. 15(1):3–43.Google Scholar
[82] Ryu EK, Yin W (2022) Large-Scale Convex Optimization: Algorithms & Analyses via Monotone Operators (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
[83] Sharma P, Panda R, Joshi G, Varshney P (2022) Federated minimax optimization: Improved convergence analyses and algorithms. Internat. Conf. Machine Learn. (PMLR, New York), 19683–19730.Google Scholar
[84] Sra S, Nowozin S, Wright SJ (2012) Optimization for Machine Learning (MIT Press, Cambridge, MA).Google Scholar
[85] Tarzanagh DA, Li M, Thrampoulidis C, Oymak S (2022) Fednest: Federated bilevel, minimax, and compositional optimization. Internat. Conf. Machine Learn. (PMLR, New York), 21146–21179.Google Scholar
[86] Tran-Dinh Q (2023) Extragradient-type methods with O(1/k)-convergence rates for co-hypomonotone inclusions. J. Global Optim. 89(1):197–221.Crossref, Google Scholar
[87] Tran-Dinh Q (2024) From Halpern’s fixed-point iterations to Nesterov’s accelerated interpretations for root-finding problems. Comput. Optim. Appl. 87(1):181–218.Crossref, Google Scholar
[88] Tran-Dinh Q, Liu D (2022) A new randomized primal-dual algorithm for convex optimization with optimal last iterate rates. Optim. Methods Software 38(1):184–217.Crossref, Google Scholar
[89] Tran-Dinh Q, Luo Y (2021) Halpern-type accelerated and splitting algorithms for monotone inclusions. Preprint, submitted October 15, https://arxiv.org/abs/2110.08150.Google Scholar
[90] Tran-Dinh Q, Pham NH, Phan DT, Nguyen LM (2021) FedDR—Randomized Douglas-Rachford splitting algorithms for nonconvex federated composite optimization. Advances in Neural Information Processing Systems, 1–39.Google Scholar
[91] Tseng P (2000) A modified forward-backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2):431–446.Crossref, Google Scholar
[92] Wright SJ (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.Crossref, Google Scholar
[93] Wright SJ (2017) Optimization algorithms for data analysis. IAS/Park City Mathematics Series, vol. 25 (American Mathematical Society, Providence, RI), 1–49.Google Scholar
[94] Yoon T, Ryu EK (2021) Accelerated algorithms for smooth convex-concave minimax problems with O(1/k2) rate on squared gradient norm. Internat. Conf. Machine Learn. (PMLR, New York), 12098–12109.Google Scholar

cover image Mathematics of Operations Research

Volume 51, Issue 1

February 2026

Pages iv-viii, 1-851

Article Information

Metrics

Information

Received:February 15, 2024
Accepted:February 20, 2025
Published Online:April 02, 2025

Cite as

Quoc Tran-Dinh, Yang Luo (2025) Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems. Mathematics of Operations Research 51(1):746-782.

https://doi.org/10.1287/moor.2024.0414

Keywords

Acknowledgments

The authors thank Mr. Nghia Nguyen-Trung for his implementation and experiments on the last example in this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems

References

Volume 51, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News