Alternating and Parallel Proximal Gradient Methods for Nonsmooth, Nonconvex Minimax: A Unified Convergence Analysis

Eyal Cohen
Eyal Cohen
[email protected]
School of Mathematical Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
Search for more papers by this author
,
Marc Teboulle
Corresponding Author
Marc Teboulle
[email protected]
https://orcid.org/0000-0002-4228-131X
School of Mathematical Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
Search for more papers by this author

Eyal Cohen

[email protected]

School of Mathematical Sciences, Tel Aviv University, Ramat Aviv 69978, Israel

Search for more papers by this author

Marc Teboulle

Corresponding Author

Marc Teboulle

[email protected]

https://orcid.org/0000-0002-4228-131X

School of Mathematical Sciences, Tel Aviv University, Ramat Aviv 69978, Israel

Search for more papers by this author

Published Online:8 Feb 2024https://doi.org/10.1287/moor.2022.0294

References

[1] Attouch H, Bolte J (2009) On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Programming 116(1–2):5–16.Crossref, Google Scholar
[2] Auslender A, Teboulle M (2005) Interior projection-like methods for monotone variational inequalities. Math. Programming 104(1):39–68.Crossref, Google Scholar
[3] Barazandeh B, Razaviyayn M (2020) Solving non-convex non-differentiable min-max games using proximal gradient method. ICASSP 2020–2020 IEEE Internat. Conf. Acoustics Speech Signal Processing (ICASSP) (IEEE, Piscataway, NJ), 3162–3166.Google Scholar
[4] Barazandeh B, Razaviyayn M, Sanjabi M (2019) Training generative networks using random discriminators. 2019 IEEE Data Sci. Workshop (DSW) (IEEE, Piscataway, NJ), 327–332.Google Scholar
[5] Bauschke HH, Bolte J, Teboulle M (2016) A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications. Math. Oper. Res. 42(2):330–348.Link, Google Scholar
[6] Beck A, Teboulle M (2009) Gradient-based algorithms with applications to signal recovery. Palomar DP, Eldar YC, eds. Convex Optimization in Signal Processing and Communications (Cambridge University Press, Cambridge, UK), 42–88.Crossref, Google Scholar
[7] Ben-Tal A, Nemirovski A (1999) Robust solutions of uncertain linear programs. Oper. Res. Lett. 25(1):1–13.Crossref, Google Scholar
[8] Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust Optimization, vol. 28 (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[9] Bertsekas D, Nedic A, Ozdaglar A (2003) Convex Analysis and Optimization, vol. 1 (Athena Scientific, Nashua, NH).Google Scholar
[10] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.Crossref, Google Scholar
[11] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.Crossref, Google Scholar
[12] Bolte J, Sabach S, Teboulle M (2018) Nonconvex lagrangian-based optimization: Monitoring schemes and global convergence. Math. Oper. Res. 43(4):1210–1232.Link, Google Scholar
[13] Bolte J, Sabach S, Teboulle M, Vaisbourd Y (2018) First order methods beyond convexity and lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3):2131–2151.Crossref, Google Scholar
[14] Bregman LM (1967) The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3):200–217.Crossref, Google Scholar
[15] Chambolle A, Pock T (2016) On the ergodic convergence rates of a first-order primal–dual algorithm. Math. Programming 159(1):253–287.Crossref, Google Scholar
[16] Chen Z, Zhou Y, Xu T, Liang Y (2021) Proximal gradient descent-ascent: Variable convergence under KŁ geometry. Internat. Conf. Learn. Representations.Google Scholar
[17] Cohen E, Hallak N, Teboulle M (2022) A dynamic alternating direction of multipliers for nonconvex minimization with nonlinear functional equality constraints. J. Optim. Theory Appl. 193(1):324–353.Crossref, Google Scholar
[18] Cohen E, Sabach S, Teboulle M (2021) Non-Euclidean proximal methods for convex-concave saddle-point problems. J. Appl. Numerical Optim. 3(1):43–60.Google Scholar
[19] Danskin JM (1966) The theory of max-min, with applications. SIAM J. Appl. Math. 14(4):641–664.Crossref, Google Scholar
[20] Daskalakis C, Skoulakis S, Zampetakis M (2021) The complexity of constrained min-max optimization. Proc. 53rd Annual ACM SIGACT Sympos. Theory Comput. (ACM, New York), 1466–1478.Google Scholar
[21] Davis D, Drusvyatskiy D (2019) Stochastic model-based minimization of weakly convex functions. SIAM J. Optim. 29(1):207–239.Crossref, Google Scholar
[22] Diakonikolas J, Daskalakis C, Jordan MI (2021) Efficient methods for structured nonconvex-nonconcave min-max optimization. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 2746–2754.Google Scholar
[23] Drori Y, Sabach S, Teboulle M (2015) A simple algorithm for a class of nonsmooth convex–concave saddle-point problems. Oper. Res. Lett. 43(2):209–214.Crossref, Google Scholar
[24] Fiez T, Ratliff L, Mazumdar E, Faulkner E, Narang A (2021) Global convergence to local minmax equilibrium in classes of nonconvex zero-sum games. Adv. Neural Inform. Processing Systems 34:29049–29063.Google Scholar
[25] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Adv. Neural Inform. Processing Systems 27:139–144.Google Scholar
[26] Korpelevich GM (1976) The extragradient method for finding saddle points and other problems. Matecon 12:747–756.Google Scholar
[27] Kurdyka K (1998) On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier (Grenoble) 48(3):769–783.Crossref, Google Scholar
[28] Lin T, Jin C, Jordan M (2020) On gradient descent ascent for nonconvex-concave minimax problems. Internat. Conf. Machine Learn. (PMLR, New York), 6083–6093.Google Scholar
[29] Liu M, Rafique H, Lin Q, Yang T (2021) First-order convergence theory for weakly-convex-weakly-concave min-max problems. J. Machine Learn. Res. 22(1):169.Google Scholar
[30] Lojasiewicz S (1963) Une propriété topologique des sous-ensembles analytiques réels. Colloques internationaux du C.N.R.S 117. Les Équations aux Dérivées Partielles (Gauthier-Villars, Paris), 87–89.Google Scholar
[31] Lu S, Tsaknakis I, Hong M, Chen Y (2020) Hybrid block successive approximation for one-sided non-convex min-max problems: Algorithms and applications. IEEE Trans. Signal Processing 68:3676–3691.Crossref, Google Scholar
[32] Mokhtari A, Ozdaglar A, Pattathil S (2020) A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 1497–1507.Google Scholar
[33] Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France 93:273–299.Crossref, Google Scholar
[34] Nemirovski A (2004) Prox-method with rate of convergence o (1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1):229–251.Crossref, Google Scholar
[35] Nouiehed M, Sanjabi M, Huang T, Lee JD, Razaviyayn M (2019) Solving a class of non-convex min-max games using iterative first order methods. Adv. Neural Inform. Processing Systems 32:14934–14942.Google Scholar
[36] Ostrovskii DM, Lowy A, Razaviyayn M (2021) Efficient search of first-order Nash equilibria in nonconvex-concave smooth min-max problems. SIAM J. Optim. 31(4):2508–2538.Crossref, Google Scholar
[37] Pock T, Cremers D, Bischof H, Chambolle A (2009) An algorithm for minimizing the Mumford–Shah functional. 2009 IEEE 12th Internat. Conf. Comput. Vision (IEEE, Piscataway, NJ), 1133–1140.Google Scholar
[38] Rafique H, Liu M, Lin Q, Yang T (2021) Weakly-convex–concave min–max optimization: Provable algorithms and applications in machine learning. Optim. Methods Software 37(3):1087–1121.Crossref, Google Scholar
[39] Razaviyayn M, Huang T, Lu S, Nouiehed M, Sanjabi M, Hong M (2020) Nonconvex min-max optimization: Applications, challenges, and recent theoretical advances. IEEE Signal Processing Magazine 37(5):55–66.Crossref, Google Scholar
[40] Rockafellar R (1970) Convex Analysis (Princeton University Press, Princeton, NJ).Crossref, Google Scholar
[41] Rockafellar R, Wets J (2004) Variational Analysis (Springer, Berlin).Google Scholar
[42] Teboulle M (2018) A simplified view of first order methods for optimization. Math. Programming 170(1):67–96.Crossref, Google Scholar
[43] Thekumparampil KK, Jain P, Netrapalli P, Oh S (2019) Efficient algorithms for smooth minimax optimization. Adv. Neural Inform. Processing Systems 32:12680–12691.Google Scholar
[44] von Neumann J (1928) Zur theorie der gesellschaftsspiele. Math. Ann. 100(1):295–320.Crossref, Google Scholar

cover image Mathematics of Operations Research

Volume 50, Issue 1

February 2025

Pages 1-781 C2

Article Information

Metrics

Information

Received:October 10, 2022
Accepted:January 01, 2024
Published Online:February 08, 2024

Cite as

Eyal Cohen, Marc Teboulle (2024) Alternating and Parallel Proximal Gradient Methods for Nonsmooth, Nonconvex Minimax: A Unified Convergence Analysis. Mathematics of Operations Research 50(1):141-168.

https://doi.org/10.1287/moor.2022.0294

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Alternating and Parallel Proximal Gradient Methods for Nonsmooth, Nonconvex Minimax: A Unified Convergence Analysis

References

Volume 50, Issue 1

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News