The Iterates of the Frank–Wolfe Algorithm May Not Converge
Published Online:7 Dec 2023https://doi.org/10.1287/moor.2022.0057
References
- [1] (2013) Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Programming 137(1):91–129.Crossref, Google Scholar
- [2] (2010) Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2):438–457.Link, Google Scholar
- [3] (2017) First-Order Methods in Optimization (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
- [4] (1999) Nonlinear Programming, 2nd ed. (Athena Scientific, Nashua, NH).Google Scholar
- [5] (2022) Curiosities and counterexamples in smooth convex optimization. Math. Programming 195(1–2):553–603.Crossref, Google Scholar
- [6] (2007) The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4):1205–1223.Crossref, Google Scholar
- [7] (2014) Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.Crossref, Google Scholar
- [8] (2007) Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2):556–572.Crossref, Google Scholar
- [9] (2010) Characterizations of Łojasiewicz inequalities and applications. Trans. Amer. Math. Soc. 362(6):3319–3363.Crossref, Google Scholar
- [10] (2017) From error bounds to the complexity of first-order descent methods for convex functions. Math. Programming 165(2):471–507.Crossref, Google Scholar
- [11] (2019) Blended conditional gradients: The unconditioning of conditional gradients. Proc. 36th Internat. Conf. Machine Learn. (PMLR, New York), 735–743.Google Scholar
- [12] (1968) A tight upper bound on the rate of convergence of Frank-Wolfe algorithm. SIAM J. Control 6(4):509–516.Crossref, Google Scholar
- [13] (2010) Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. ACM Trans. Algorithms 6(4):1–30.Crossref, Google Scholar
- [14] (2020) Boosting Frank-Wolfe by chasing gradients. Proc. 37th Internat. Conf. Machine Learn. (PMLR, New York), 2111–2121.Google Scholar
- [15] (2021) Complexity of linear minimization and projection on some sets. Oper. Res. Lett. 49(4):565–571.Crossref, Google Scholar
- [16] (2023) Revisiting the approximate Carathéodory problem via the Frank-Wolfe algorithm. Math. Programming 197(1):191–214.Crossref, Google Scholar
- [17] (2011) Proximal splitting methods in signal processing. Fixed-Point Algorithms for Inverse Problems in Science and Engineering (Springer, Berlin, Germany), 185–212.Crossref, Google Scholar
- [18] (2021) Two-sided fairness in rankings via Lorenz dominance. Adv. Neural Inform. Processing Systems 34:8596–8608.Google Scholar
- [19] (1978) Conditional gradient algorithms with open loop step size rules. J. Math. Anal. Appl. 62(2):432–444.Crossref, Google Scholar
- [20] (1956) An algorithm for quadratic programming. Naval Res. Logist. Quart. 3(1–2):95–110.Crossref, Google Scholar
- [21] (2015) Faster rates for the Frank-Wolfe method over strongly-convex sets. Proc. 32nd Internat. Conf. Machine Learn. (PMLR, New York), 541–549.Google Scholar
- [22] (2016) Linear-memory and decomposition-invariant linearly convergent conditional gradient algorithm for structured polytopes. Adv. Neural Inform. Processing Systems 29:1001–1009.Google Scholar
- [23] (1986) Some comments on Wolfe’s “away step.” Math. Programming 35(1):110–119.Crossref, Google Scholar
- [24] (2008) Sparse approximate solutions to semidefinite programs. Proc. Eighth Latin Amer. Sympos. Theoretical Informatics, 306–316.Google Scholar
- [25] (2013) Revisiting Frank-Wolfe: Projection-free sparse convex optimization. Proc. 30th Internat. Conf. Machine Learn. (PMLR, New York), 427–435.Google Scholar
- [26] (2014) Efficient image and video co-localization with Frank-Wolfe algorithm. Eur. Conf. Comput. Vision, 253–268.Google Scholar
- [27] (2021) Projection-free optimization on uniformly convex sets. Proc. 24th Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 19–27.Google Scholar
- [28] (1998) On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier (Grenoble) 48(3):769–783.Crossref, Google Scholar
- [29] (2015) On the global linear convergence of Frank-Wolfe optimization variants. Adv. Neural Inform. Processing Systems 28:496–504.Google Scholar
- [30] (2013) Block-coordinate Frank-Wolfe optimization for structural SVMs. Proc. 30th Internat. Conf. Machine Learn. (PMLR, New York), 53–61.Google Scholar
- [31] (2013) The complexity of large-scale convex programming under a linear optimization oracle. Technical report, Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL.Google Scholar
- [32] (1975) An efficient approach to solving the road network equilibrium traffic assignment problem. Transportation Res. 9(5):309–318.Crossref, Google Scholar
- [33] (1966) Constrained minimization methods. USSR Comput. Math. Math. Phys. 6(5):1–50.Crossref, Google Scholar
- [34] (2021) Heavy ball momentum for conditional gradient. Adv. Neural Inform. Processing Systems 34:21244–21255.Google Scholar
- [35] (1963) Une propriété topologique des sous-ensembles analytiques réels. Les Équations aux Dérivées Partielles (Éditions du Centre National de la Recherche Scientifique, Paris), 87–89.Google Scholar
- [36] (2019) Sinkhorn barycenters with free support via Frank-Wolfe algorithm. Adv. Neural Inform. Processing Systems 32:9322–9333.Google Scholar
- [37] (2013) Convex Bodies: The Brunn-Minkowski Theory, 2nd ed. (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar

