Randomized Block-Coordinate Optimistic Gradient Algorithms for Root-Finding Problems
Published Online:2 Apr 2025https://doi.org/10.1287/moor.2024.0414
References
- [1] Abhishek VA, Binny S, Johan TR, Nithin R, Vishal T (2022) Federated learning: Collaborative machine learning without centralized training data. Internat. J. Engrg. Technol. Management Sci. 6(5):355–359.Google Scholar
- [2] (2001) Fixed Point Theory and Applications, vol. 141 (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- [3] (2017) Smooth primal-dual coordinate descent algorithms for nonsmooth convex optimization. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., Red Hook, NY), 1–9.Google Scholar
- [4] (2017) Wasserstein generative adversarial networks. Proc. 34th Internat. Conf. Machine Learn., vol. 70 (JMLR.org), 214–223.Google Scholar
- [5] (2020) Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Programmng 184(1):243–287.Crossref, Google Scholar
- [6] (2018) Backward–forward algorithms for structured monotone inclusions in Hilbert spaces. J. Math. Anal. Appl. 457(2):1095–1117.Crossref, Google Scholar
- [7] (1971) Méthodes numériques pour la décomposition et la minimisation de fonctions non différentiables. Numerische Mathematik 18(3):213–223.Crossref, Google Scholar
- [8] (2017) Convex Analysis and Monotone Operators Theory in Hilbert Spaces, 2nd ed. (Springer-Verlag, New York).Crossref, Google Scholar
- [9] (2015) The cyclic block conditional gradient method for convex optimization problems. SIAM J. Optim. 25(4):2024–2049.Crossref, Google Scholar
- [10] (2009) Robust Optimization (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
- [11] (2001) The ordered subsets mirror descent optimization method with applications to tomography. SIAM J. Optim. 12(1):79–108.Crossref, Google Scholar
- [12] (1989) Parallel and Distributed Computation: Numerical Methods (Prentice Hall, Englewood Cliffs, NJ).Google Scholar
- [13] (2023) Solving nonconvex-nonconcave min-max problems exhibiting weak Minty solutions. Trans. Machine Learn. Res., vol. 39 (Journal of Machine Learning Research Inc., New York).Google Scholar
- [14] (2023) A relaxed inertial forward-backward-forward algorithm for solving monotone inclusions with application to GANs. J. Machine Learn. Res. 24(8):1–37.Google Scholar
- [15] (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.Crossref, Google Scholar
- [16] (2008) Set-Valued Mappings and Enlargements of Monotone Operators (Springer, New York).Google Scholar
- [17] (2011) LIBSVM: A library for support vector machines. ACM Trans. Intelligent Systems Tech. 2(3): 1–27.Crossref, Google Scholar
- [18] (2017) Accelerated schemes for a class of variational inequalities. Math. Programming 165(1):113–149.Crossref, Google Scholar
- [19] (2018) Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions. Math. Programming 168(1):645–672.Crossref, Google Scholar
- [20] (2011) Proximal splitting methods in signal processing. Fixed-Point Algorithms for Inverse Problems in Science and Engineering (Springer-Verlag, Berlin), 185–212.Crossref, Google Scholar
- [21] (2015) Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping. SIAM J. Optim. 25(2):1221–1248.Crossref, Google Scholar
- [22] (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model. Simulation 4(4):1168–1200.Crossref, Google Scholar
- [23] (2018) Training GANs with optimism. 6th Internat. Conf. Learn. Representations (ICLR 2018) (Vancouver).Google Scholar
- [24] (2022) Variance reduction for root-finding problems. Math. Programming 197:375–410.Crossref, Google Scholar
- [25] (2017) A three-operator splitting scheme and its optimization applications. Set-Valued Variational Anal. 25(4):829–858.Crossref, Google Scholar
- [26] (2014) SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 1646–1654.Google Scholar
- [27] (2020) Halpern iteration for near-optimal and parameter-free monotone inclusion and strong solutions to variational inequalities. Conf. Learn. Theory (PMLR, New York), 1428–1451.Google Scholar
- [28] (2021) Efficient methods for structured nonconvex-nonconcave min-max optimization. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 2746–2754.Google Scholar
- [29] (2018) Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3):919–948.Link, Google Scholar
- [30] (2021) Fairness-aware agnostic federated learning. Proc. 2021 SIAM Internat. Conf. Data Mining (SIAM, Philadelphia), 181–189.Google Scholar
- [31] (2003) Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. 1–2 (Springer-Verlag, Berlin).Google Scholar
- [32] (2015) Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25(4):1997–2023.Crossref, Google Scholar
- [33] (2014) Generative adversarial nets. Advances in Neural Information Processing Systems, vol. 27 (Curran Associates, Inc., Red Hook, NY), 2672–2680.Google Scholar
- [34] (1967) Fixed points of nonexpanding maps. Bull. Amer. Math. Soc. 73(6):957–961.Crossref, Google Scholar
- [35] (2022) Game theory in defence applications: A review. Sensors 22(3):1032.Crossref, Google Scholar
- [36] (2013) BIG & QUIC: Sparse inverse covariance estimation for a million variables. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Inc., Red Hook, NY), 3165–3173.Google Scholar
- [37] Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, et al. (2021) Advances and open problems in federated learning. Found Trends Machine Learn. 14(1–2):1–210.Google Scholar
- [38] (2021) Accelerated proximal point method for maximally monotone operators. Math. Programming 190:57–87.Crossref, Google Scholar
- [39] (2016) Optimized first-order methods for smooth convex minimization. Math. Programming 159(1–2):81–107.Crossref, Google Scholar
- [40] (2016) Federated optimization: Distributed machine learning for on-device intelligence. Preprint, submitted October 8, https://arxiv.org/abs/1610.02527.Google Scholar
- [41] (2001) Combined Relaxation Methods for Variational Inequalities (Springer-Verlag, Berlin).Crossref, Google Scholar
- [42] (1976) The extragradient method for finding saddle points and other problems. Ekonomika i Matematicheskie Metody 12(4):747–756.Google Scholar
- [43] (2022) Simple and optimal methods for stochastic variational inequalities, I: Operator extrapolation. SIAM J. Optim. 32(3):2041–2073.Crossref, Google Scholar
- [44] (1996) The work of John Nash in game theory. J. Econom. Theory 69(1):153–185.Crossref, Google Scholar
- [45] (2020) First-Order and Stochastic Optimization Methods for Machine Learning (Springer, Cham, Switzerland).Crossref, Google Scholar
- [46] (2019) Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems. Math. Programming 193:195–224.Crossref, Google Scholar
- [47] (2021) Fast extra gradient methods for smooth structured nonconvex-nonconcave minimax problems. Advances in Neural Information Processing Systems, vol. 34 (Curran Associates Inc., Red Hook, NY), 16924–16936.Google Scholar
- [48] (2020) Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine 37(3):50–60.Crossref, Google Scholar
- [49] (2020) Federated optimization in heterogeneous networks. Dhillon I, Papailiopoulos D, Sze V, eds. Proceedings of Machine Learning and Systems, vol. 2, 429–450.Google Scholar
- [50] (2021) On the convergence rate of the Halpern-iteration. Optim. Lett. 15(2):405–418.Crossref, Google Scholar
- [51] (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numerical Anal. 16(6):964–979.Crossref, Google Scholar
- [52] (1993) Error bounds and convergence analysis of feasible descent methods: A general approach. Ann. Oper. Res. 46(1):157–178.Crossref, Google Scholar
- [53] (2018) Towards deep learning models resistant to adversarial attacks. Proc. Internat. Conf. Learn. Representations (ICLR) (Vancouver).Google Scholar
- [54] (2021) Accelerated proximal algorithms with a correction term for monotone inclusions. Appl. Math. Optim. 84(2):2027–2061.Crossref, Google Scholar
- [55] (2022) Fast convergence of generalized forward-backward algorithms for structured monotone inclusions. J. Convex Anal. 29:893–920.Google Scholar
- [56] (2015) Projected reflected gradient methods for monotone variational inequalities. SIAM J. Optim. 25(1):502–520.Crossref, Google Scholar
- [57] (2020) A forward-backward splitting method for monotone inclusions without cocoercivity. SIAM J. Optim. 30(2):1451–1472.Crossref, Google Scholar
- [58] (2020) Minimax pareto fairness: A multi objective perspective. Internat. Conf. Machine Learn. (PMLR, New York), 6755–6764.Google Scholar
- [59] (2017) Communication-efficient learning of deep networks from decentralized data. Artificial Intelligence Statist. (PMLR, New York), 1273–1282.Google Scholar
- [60] (2020) A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 1497–1507.Google Scholar
- [61] (2016) Stochastic gradient methods for distributionally robust optimization with f-divergences. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates Inc., Red Hook, NY).Google Scholar
- [62] (2009) Distributed subgradient methods for multi-agent optimization. IEEE Trans. Automatic Control 54(1):48–61.Crossref, Google Scholar
- [63] (2004) Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1):229–251.Crossref, Google Scholar
- [64] (1983) A method for unconstrained convex minimization problem with the rate of convergence O(1/k2). Soviet Math. Doklady 269:543–547.Google Scholar
- [65] (2004) Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87 (Kluwer Academic Publishers, Boston).Crossref, Google Scholar
- [66] (2007) Dual extrapolation and its applications to solving variational inequalities and related problems. Math. Programming 109(2–3):319–344.Crossref, Google Scholar
- [67] (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.Crossref, Google Scholar
- [68] (2017) Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27(1):110–123.Crossref, Google Scholar
- [69] (2000) Iterative Solution of Nonlinear Equations in Several Variables (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
- [70] (2021) Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Math. Programming 185(1–2):1–35.Crossref, Google Scholar
- [71] (2016) ARock: An algorithmic framework for asynchronous parallel coordinate updates. SIAM J. Sci. Comput. 38(5):2851–2879.Crossref, Google Scholar
- [72] (2022) Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems. Proc. 10th Internat. Conf. Learn. Representations (ICLR 2022) (Virtual Conference: International Conference on Learning Representations).Google Scholar
- [73] (2010) Evolution equations for maximal monotone operators: Asymptotic analysis in continuous and discrete time. J. Convex Anal. (Oxford) 17(3 & 4):1113–1163.Google Scholar
- [74] (2009) Convex Functions, Monotone Operators and Differentiability, vol. 1364 (Springer, New York).Google Scholar
- [75] (1980) A modification of the Arrow-Hurwicz method for search of saddle points. Math. Notes Acad. of Sci. USSR 28(5):845–848.Google Scholar
- [76] (2019) Distributionally robust optimization: A review. Preprint, submitted August 13, https://arxiv.org/abs/1908.05659.Google Scholar
- [77] (2016) Parallel coordinate descent methods for big data optimization. Math. Programming 156(1–2):433–484.Crossref, Google Scholar
- [78] (1971) A convergence theorem for non negative almost supermartingales and some applications. Rustagi JS, ed. Optimizing Methods in Statistics (Academic Press, New York), 233–257.Google Scholar
- [79] (1976) Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5):877–898.Crossref, Google Scholar
- [80] Rockafellar RT, Wets RJ-B, eds. (2004) Variational Analysis, vol. 317 (Springer-Verlag, Berlin).Google Scholar
- [81] (2016) Primer on monotone operator methods. Appl. Comput. Math. 15(1):3–43.Google Scholar
- [82] (2022) Large-Scale Convex Optimization: Algorithms & Analyses via Monotone Operators (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
- [83] (2022) Federated minimax optimization: Improved convergence analyses and algorithms. Internat. Conf. Machine Learn. (PMLR, New York), 19683–19730.Google Scholar
- [84] (2012) Optimization for Machine Learning (MIT Press, Cambridge, MA).Google Scholar
- [85] (2022) Fednest: Federated bilevel, minimax, and compositional optimization. Internat. Conf. Machine Learn. (PMLR, New York), 21146–21179.Google Scholar
- [86] (2023) Extragradient-type methods with O(1/k)-convergence rates for co-hypomonotone inclusions. J. Global Optim. 89(1):197–221.Crossref, Google Scholar
- [87] (2024) From Halpern’s fixed-point iterations to Nesterov’s accelerated interpretations for root-finding problems. Comput. Optim. Appl. 87(1):181–218.Crossref, Google Scholar
- [88] (2022) A new randomized primal-dual algorithm for convex optimization with optimal last iterate rates. Optim. Methods Software 38(1):184–217.Crossref, Google Scholar
- [89] (2021) Halpern-type accelerated and splitting algorithms for monotone inclusions. Preprint, submitted October 15, https://arxiv.org/abs/2110.08150.Google Scholar
- [90] (2021) FedDR—Randomized Douglas-Rachford splitting algorithms for nonconvex federated composite optimization. Advances in Neural Information Processing Systems, 1–39.Google Scholar
- [91] (2000) A modified forward-backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2):431–446.Crossref, Google Scholar
- [92] (2015) Coordinate descent algorithms. Math. Programming 151(1):3–34.Crossref, Google Scholar
- [93] (2017) Optimization algorithms for data analysis. IAS/Park City Mathematics Series, vol. 25 (American Mathematical Society, Providence, RI), 1–49.Google Scholar
- [94] (2021) Accelerated algorithms for smooth convex-concave minimax problems with O(1/k2) rate on squared gradient norm. Internat. Conf. Machine Learn. (PMLR, New York), 12098–12109.Google Scholar

