The Iterates of the Frank–Wolfe Algorithm May Not Converge

Published Online:https://doi.org/10.1287/moor.2022.0057

References

  • [1] Attouch H, Bolte J, Svaiter BF (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Programming 137(1):91–129.CrossrefGoogle Scholar
  • [2] Attouch H, Bolte J, Redont P, Soubeyran A (2010) Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2):438–457.LinkGoogle Scholar
  • [3] Beck A (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [4] Bertsekas DP (1999) Nonlinear Programming, 2nd ed. (Athena Scientific, Nashua, NH).Google Scholar
  • [5] Bolte J, Pauwels E (2022) Curiosities and counterexamples in smooth convex optimization. Math. Programming 195(1–2):553–603.CrossrefGoogle Scholar
  • [6] Bolte J, Daniilidis A, Lewis A (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.CrossrefGoogle Scholar
  • [7] Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.CrossrefGoogle Scholar
  • [8] Bolte J, Daniilidis A, Lewis A, Shiota M (2007) Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2):556–572.CrossrefGoogle Scholar
  • [9] Bolte J, Daniilidis A, Ley O, Mazet L (2010) Characterizations of Łojasiewicz inequalities and applications. Trans. Amer. Math. Soc. 362(6):3319–3363.CrossrefGoogle Scholar
  • [10] Bolte J, Nguyen TP, Peypouquet J, Suter BW (2017) From error bounds to the complexity of first-order descent methods for convex functions. Math. Programming 165(2):471–507.CrossrefGoogle Scholar
  • [11] Braun G, Pokutta S, Tu D, Wright S (2019) Blended conditional gradients: The unconditioning of conditional gradients. Proc. 36th Internat. Conf. Machine Learn. (PMLR, New York), 735–743.Google Scholar
  • [12] Canon MD, Cullum CD (1968) A tight upper bound on the rate of convergence of Frank-Wolfe algorithm. SIAM J. Control 6(4):509–516.CrossrefGoogle Scholar
  • [13] Clarkson KL (2010) Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Trans. Algorithms 6(4):1–30.CrossrefGoogle Scholar
  • [14] Combettes CW, Pokutta S (2020) Boosting Frank-Wolfe by chasing gradients. Proc. 37th Internat. Conf. Machine Learn. (PMLR, New York), 2111–2121.Google Scholar
  • [15] Combettes CW, Pokutta S (2021) Complexity of linear minimization and projection on some sets. Oper. Res. Lett. 49(4):565–571.CrossrefGoogle Scholar
  • [16] Combettes CW, Pokutta S (2023) Revisiting the approximate Carathéodory problem via the Frank-Wolfe algorithm. Math. Programming 197(1):191–214.CrossrefGoogle Scholar
  • [17] Combettes PL, Pesquet JC (2011) Proximal splitting methods in signal processing. Fixed-Point Algorithms for Inverse Problems in Science and Engineering (Springer, Berlin, Germany), 185–212.CrossrefGoogle Scholar
  • [18] Do V, Corbett-Davies S, Atif J, Usunier N (2021) Two-sided fairness in rankings via Lorenz dominance. Adv. Neural Inform. Processing Systems 34:8596–8608.Google Scholar
  • [19] Dunn JC, Harshbarger S (1978) Conditional gradient algorithms with open loop step size rules. J. Math. Anal. Appl. 62(2):432–444.CrossrefGoogle Scholar
  • [20] Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res. Logist. Quart. 3(1–2):95–110.CrossrefGoogle Scholar
  • [21] Garber D, Hazan E (2015) Faster rates for the Frank-Wolfe method over strongly-convex sets. Proc. 32nd Internat. Conf. Machine Learn. (PMLR, New York), 541–549.Google Scholar
  • [22] Garber D, Meshi O (2016) Linear-memory and decomposition-invariant linearly convergent conditional gradient algorithm for structured polytopes. Adv. Neural Inform. Processing Systems 29:1001–1009.Google Scholar
  • [23] Guélat J, Marcotte P (1986) Some comments on Wolfe’s “away step.” Math. Programming 35(1):110–119.CrossrefGoogle Scholar
  • [24] Hazan E (2008) Sparse approximate solutions to semidefinite programs. Proc. Eighth Latin Amer. Sympos. Theoretical Informatics, 306–316.Google Scholar
  • [25] Jaggi M (2013) Revisiting Frank-Wolfe: Projection-free sparse convex optimization. Proc. 30th Internat. Conf. Machine Learn. (PMLR, New York), 427–435.Google Scholar
  • [26] Joulin A, Tang K, Fei-Fei L (2014) Efficient image and video co-localization with Frank-Wolfe algorithm. Eur. Conf. Comput. Vision, 253–268.Google Scholar
  • [27] Kerdreux T, d’Aspremont A, Pokutta S (2021) Projection-free optimization on uniformly convex sets. Proc. 24th Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 19–27.Google Scholar
  • [28] Kurdyka K (1998) On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier (Grenoble) 48(3):769–783.CrossrefGoogle Scholar
  • [29] Lacoste-Julien S, Jaggi M (2015) On the global linear convergence of Frank-Wolfe optimization variants. Adv. Neural Inform. Processing Systems 28:496–504.Google Scholar
  • [30] Lacoste-Julien S, Jaggi M, Schmidt M, Pletscher P (2013) Block-coordinate Frank-Wolfe optimization for structural SVMs. Proc. 30th Internat. Conf. Machine Learn. (PMLR, New York), 53–61.Google Scholar
  • [31] Lan G (2013) The complexity of large-scale convex programming under a linear optimization oracle. Technical report, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL.Google Scholar
  • [32] LeBlanc LJ, Morlok EK, Pierskalla WP (1975) An efficient approach to solving the road network equilibrium traffic assignment problem. Transportation Res. 9(5):309–318.CrossrefGoogle Scholar
  • [33] Levitin ES, Polyak BT (1966) Constrained minimization methods. USSR Comput. Math. Math. Phys. 6(5):1–50.CrossrefGoogle Scholar
  • [34] Li B, Sadeghi A, Giannakis G (2021) Heavy ball momentum for conditional gradient. Adv. Neural Inform. Processing Systems 34:21244–21255.Google Scholar
  • [35] Łojasiewicz S (1963) Une propriété topologique des sous-ensembles analytiques réels. Les Équations aux Dérivées Partielles (Éditions du Centre National de la Recherche Scientifique, Paris), 87–89.Google Scholar
  • [36] Luise G, Salzo S, Pontil M, Ciliberto C (2019) Sinkhorn barycenters with free support via Frank-Wolfe algorithm. Adv. Neural Inform. Processing Systems 32:9322–9333.Google Scholar
  • [37] Schneider R (2013) Convex Bodies: The Brunn-Minkowski Theory, 2nd ed. (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.