Alternating and Parallel Proximal Gradient Methods for Nonsmooth, Nonconvex Minimax: A Unified Convergence Analysis

Published Online:https://doi.org/10.1287/moor.2022.0294

References

  • [1] Attouch H, Bolte J (2009) On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Programming 116(1–2):5–16.CrossrefGoogle Scholar
  • [2] Auslender A, Teboulle M (2005) Interior projection-like methods for monotone variational inequalities. Math. Programming 104(1):39–68.CrossrefGoogle Scholar
  • [3] Barazandeh B, Razaviyayn M (2020) Solving non-convex non-differentiable min-max games using proximal gradient method. ICASSP 2020–2020 IEEE Internat. Conf. Acoustics Speech Signal Processing (ICASSP) (IEEE, Piscataway, NJ), 3162–3166.Google Scholar
  • [4] Barazandeh B, Razaviyayn M, Sanjabi M (2019) Training generative networks using random discriminators. 2019 IEEE Data Sci. Workshop (DSW) (IEEE, Piscataway, NJ), 327–332.Google Scholar
  • [5] Bauschke HH, Bolte J, Teboulle M (2016) A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications. Math. Oper. Res. 42(2):330–348.LinkGoogle Scholar
  • [6] Beck A, Teboulle M (2009) Gradient-based algorithms with applications to signal recovery. Palomar DP, Eldar YC, eds. Convex Optimization in Signal Processing and Communications (Cambridge University Press, Cambridge, UK), 42–88.CrossrefGoogle Scholar
  • [7] Ben-Tal A, Nemirovski A (1999) Robust solutions of uncertain linear programs. Oper. Res. Lett. 25(1):1–13.CrossrefGoogle Scholar
  • [8] Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust Optimization, vol. 28 (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • [9] Bertsekas D, Nedic A, Ozdaglar A (2003) Convex Analysis and Optimization, vol. 1 (Athena Scientific, Nashua, NH).Google Scholar
  • [10] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.CrossrefGoogle Scholar
  • [11] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.CrossrefGoogle Scholar
  • [12] Bolte J, Sabach S, Teboulle M (2018) Nonconvex lagrangian-based optimization: Monitoring schemes and global convergence. Math. Oper. Res. 43(4):1210–1232.LinkGoogle Scholar
  • [13] Bolte J, Sabach S, Teboulle M, Vaisbourd Y (2018) First order methods beyond convexity and lipschitz gradient continuity with applications to quadratic inverse problems. SIAM J. Optim. 28(3):2131–2151.CrossrefGoogle Scholar
  • [14] Bregman LM (1967) The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 7(3):200–217.CrossrefGoogle Scholar
  • [15] Chambolle A, Pock T (2016) On the ergodic convergence rates of a first-order primal–dual algorithm. Math. Programming 159(1):253–287.CrossrefGoogle Scholar
  • [16] Chen Z, Zhou Y, Xu T, Liang Y (2021) Proximal gradient descent-ascent: Variable convergence under KŁ geometry. Internat. Conf. Learn. Representations.Google Scholar
  • [17] Cohen E, Hallak N, Teboulle M (2022) A dynamic alternating direction of multipliers for nonconvex minimization with nonlinear functional equality constraints. J. Optim. Theory Appl. 193(1):324–353.CrossrefGoogle Scholar
  • [18] Cohen E, Sabach S, Teboulle M (2021) Non-Euclidean proximal methods for convex-concave saddle-point problems. J. Appl. Numerical Optim. 3(1):43–60.Google Scholar
  • [19] Danskin JM (1966) The theory of max-min, with applications. SIAM J. Appl. Math. 14(4):641–664.CrossrefGoogle Scholar
  • [20] Daskalakis C, Skoulakis S, Zampetakis M (2021) The complexity of constrained min-max optimization. Proc. 53rd Annual ACM SIGACT Sympos. Theory Comput. (ACM, New York), 1466–1478.Google Scholar
  • [21] Davis D, Drusvyatskiy D (2019) Stochastic model-based minimization of weakly convex functions. SIAM J. Optim. 29(1):207–239.CrossrefGoogle Scholar
  • [22] Diakonikolas J, Daskalakis C, Jordan MI (2021) Efficient methods for structured nonconvex-nonconcave min-max optimization. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 2746–2754.Google Scholar
  • [23] Drori Y, Sabach S, Teboulle M (2015) A simple algorithm for a class of nonsmooth convex–concave saddle-point problems. Oper. Res. Lett. 43(2):209–214.CrossrefGoogle Scholar
  • [24] Fiez T, Ratliff L, Mazumdar E, Faulkner E, Narang A (2021) Global convergence to local minmax equilibrium in classes of nonconvex zero-sum games. Adv. Neural Inform. Processing Systems 34:29049–29063.Google Scholar
  • [25] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Adv. Neural Inform. Processing Systems 27:139–144.Google Scholar
  • [26] Korpelevich GM (1976) The extragradient method for finding saddle points and other problems. Matecon 12:747–756.Google Scholar
  • [27] Kurdyka K (1998) On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier (Grenoble) 48(3):769–783.CrossrefGoogle Scholar
  • [28] Lin T, Jin C, Jordan M (2020) On gradient descent ascent for nonconvex-concave minimax problems. Internat. Conf. Machine Learn. (PMLR, New York), 6083–6093.Google Scholar
  • [29] Liu M, Rafique H, Lin Q, Yang T (2021) First-order convergence theory for weakly-convex-weakly-concave min-max problems. J. Machine Learn. Res. 22(1):169.Google Scholar
  • [30] Lojasiewicz S (1963) Une propriété topologique des sous-ensembles analytiques réels. Colloques internationaux du C.N.R.S 117. Les Équations aux Dérivées Partielles (Gauthier-Villars, Paris), 87–89.Google Scholar
  • [31] Lu S, Tsaknakis I, Hong M, Chen Y (2020) Hybrid block successive approximation for one-sided non-convex min-max problems: Algorithms and applications. IEEE Trans. Signal Processing 68:3676–3691.CrossrefGoogle Scholar
  • [32] Mokhtari A, Ozdaglar A, Pattathil S (2020) A unified analysis of extra-gradient and optimistic gradient methods for saddle point problems: Proximal point approach. Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 1497–1507.Google Scholar
  • [33] Moreau JJ (1965) Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France 93:273–299.CrossrefGoogle Scholar
  • [34] Nemirovski A (2004) Prox-method with rate of convergence o (1/t) for variational inequalities with lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optim. 15(1):229–251.CrossrefGoogle Scholar
  • [35] Nouiehed M, Sanjabi M, Huang T, Lee JD, Razaviyayn M (2019) Solving a class of non-convex min-max games using iterative first order methods. Adv. Neural Inform. Processing Systems 32:14934–14942.Google Scholar
  • [36] Ostrovskii DM, Lowy A, Razaviyayn M (2021) Efficient search of first-order Nash equilibria in nonconvex-concave smooth min-max problems. SIAM J. Optim. 31(4):2508–2538.CrossrefGoogle Scholar
  • [37] Pock T, Cremers D, Bischof H, Chambolle A (2009) An algorithm for minimizing the Mumford–Shah functional. 2009 IEEE 12th Internat. Conf. Comput. Vision (IEEE, Piscataway, NJ), 1133–1140.Google Scholar
  • [38] Rafique H, Liu M, Lin Q, Yang T (2021) Weakly-convex–concave min–max optimization: Provable algorithms and applications in machine learning. Optim. Methods Software 37(3):1087–1121.CrossrefGoogle Scholar
  • [39] Razaviyayn M, Huang T, Lu S, Nouiehed M, Sanjabi M, Hong M (2020) Nonconvex min-max optimization: Applications, challenges, and recent theoretical advances. IEEE Signal Processing Magazine 37(5):55–66.CrossrefGoogle Scholar
  • [40] Rockafellar R (1970) Convex Analysis (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • [41] Rockafellar R, Wets J (2004) Variational Analysis (Springer, Berlin).Google Scholar
  • [42] Teboulle M (2018) A simplified view of first order methods for optimization. Math. Programming 170(1):67–96.CrossrefGoogle Scholar
  • [43] Thekumparampil KK, Jain P, Netrapalli P, Oh S (2019) Efficient algorithms for smooth minimax optimization. Adv. Neural Inform. Processing Systems 32:12680–12691.Google Scholar
  • [44] von Neumann J (1928) Zur theorie der gesellschaftsspiele. Math. Ann. 100(1):295–320.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.