New Analysis of an Away-Step Frank–Wolfe Method for Minimizing Log-Homogeneous Barriers

Published Online:https://doi.org/10.1287/moor.2023.0281

References

  • [1] Ahipasaoglu SD, Sun P, Todd MJ (2008) Linear convergence of a modified Frank–Wolfe algorithm for computing minimum-volume enclosing ellipsoids. Optim. Methods Software 23(1):5–19.CrossrefGoogle Scholar
  • [2] Arimoto S (1972) An algorithm for computing the capacity of arbitrary discrete memoryless channels. IEEE Trans. Inform. Theory 18(1):14–20.CrossrefGoogle Scholar
  • [3] Atwood CL (1973) Sequences converging to D-optimal designs of experiments. Ann. Statist. 1(2):342–352.CrossrefGoogle Scholar
  • [4] Bauschke HH, Bolte J, Teboulle M (2017) A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications. Math. Oper. Res. 42(2):330–348.LinkGoogle Scholar
  • [5] Beck A, Shtern S (2017) Linearly convergent away-step conditional gradient for non-strongly convex functions. Math. Programming 164:1–27.CrossrefGoogle Scholar
  • [6] Canon MD, Cullum CD (1968) A tight upper bound on the rate of convergence of Frank–Wolfe algorithm. SIAM J. Control 6(4):509–516.CrossrefGoogle Scholar
  • [7] Carderera A, Besançon M, Pokutta S (2021) Simple steps are all you need: Frank–Wolfe and generalized self-concordant functions. 35th Conf. Neural Inform. Processing Systems (NeurIPS 2021) (Curran Associates Inc., Red Hook, NY), 5390–5401.Google Scholar
  • [8] Chambolle A, Ehrhardt MJ, Richtárik P, Schónlieb CB (2018) Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications. SIAM J. Optim. 28(4):2783–2808.CrossrefGoogle Scholar
  • [9] Cover T (1984) An algorithm for maximizing expected log investment return. IEEE Trans. Inform. Theory 30(2):369–373.CrossrefGoogle Scholar
  • [10] Demyanov V, Rubinov A (1967) The minimization of a smooth convex functional on a convex set. SIAM J. Control 5(2):280–294.CrossrefGoogle Scholar
  • [11] Dunn J (1980) Convergence rates for conditional gradient sequences generated by implicit step length rules. SIAM J. Control Optim. 18(5):473–487.CrossrefGoogle Scholar
  • [12] Dvurechensky P, Safin K, Shtern S, Staudigl M (2023) Generalized self-concordant analysis of Frank–Wolfe algorithms. Math. Programming 198:255–323.CrossrefGoogle Scholar
  • [13] Dvurechensky P, Ostroukhov P, Safin K, Shtern S, Staudigl M (2020) Self-concordant analysis of Frank–Wolfe algorithms. Proc. ICML 119:2814–2824.Google Scholar
  • [14] Fedorov VV (1972) Theory of Optimal Experiments (Academic Press, New York).Google Scholar
  • [15] Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res. Logist. 3(1–2):95–110.CrossrefGoogle Scholar
  • [16] Freund RM, Grigas P (2016) New analysis and results for the Frank–Wolfe method. Math. Programming 155:199–230.CrossrefGoogle Scholar
  • [17] Freund RM, Grigas P, Mazumder R (2017) An extended Frank–Wolfe method with “in-face” directions, and its application to low-rank matrix completion. SIAM J. Optim. 27(1):319–346.CrossrefGoogle Scholar
  • [18] Garber D (2020) Revisiting Frank–Wolfe for polytopes: Strict complementarity and sparsity. 34th Conf. Neural Inform. Processing Systems (NeurIPS 2020) (Curran Associates Inc., Red Hook, NY), 18883–18893.Google Scholar
  • [19] Ghadimi S (2019) Conditional gradient type methods for composite nonlinear and stochastic optimization. Math. Programming 173:431–464.CrossrefGoogle Scholar
  • [20] Guélat J, Marcotte P (1986) Some comments on Wolfe’s “away step.” Math. Programming 35:110–119.CrossrefGoogle Scholar
  • [21] Harchaoui Z, Juditsky A, Nemirovski A (2015) Conditional gradient algorithms for norm-regularized smooth convex optimization. Math. Programming 152:75–112.CrossrefGoogle Scholar
  • [22] Harmany ZT, Marcia RF, Willett RM (2012) This is spiral-tap: Sparse Poisson intensity reconstruction algorithms—Theory and practice. IEEE Trans. Image Processing 21(3):1084–1096.CrossrefGoogle Scholar
  • [23] Hawkes AG (1971) Spectra of some self-exciting and mutually exciting point processes. Biometrika 58(1):83–90.CrossrefGoogle Scholar
  • [24] Jaggi M (2013) Revisiting Frank–Wolfe: Projection-free sparse convex optimization. Proc. ICML 28(1):427–435.Google Scholar
  • [25] John F (1948) Extremum problems with inequalities as subsidiary conditions. Studies and Essays, Presented to R. Courant on His 60th Birthday (Interscience, New York), 187–204.Google Scholar
  • [26] Khachiyan LG (1996) Rounding of polytopes in the real number model of computation. Math. Oper. Res. 21(2):307–320.LinkGoogle Scholar
  • [27] Lacoste-Julien S, Jaggi M (2015) On the global linear convergence of Frank–Wolfe optimization variants. Advances in Neural Information Processing Systems, vol. 28 (Curran Associates, Inc., Red Hook, NY), 496–504.Google Scholar
  • [28] Levitin E, Polyak B (1966) Constrained minimization methods. USSR Comput. Math. Math. Phys. 6(5):1–50.CrossrefGoogle Scholar
  • [29] Lu H, Freund RM, Nesterov Y (2018) Relatively smooth convex optimization by first-order methods, and applications. SIAM J. Optim. 28(1):333–354.CrossrefGoogle Scholar
  • [30] Luo Z, Tseng P (1992) On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 72:7–35.CrossrefGoogle Scholar
  • [31] Morse S (2020) Python class for multivariate Hawkes processes. https://github.com/stmorse/hawkes/.Google Scholar
  • [32] Nesterov Y (2004) Introductory Lectures on Convex Optimization: A Basic Course (Springer, New York).CrossrefGoogle Scholar
  • [33] Nesterov Y (2013) Gradient methods for minimizing composite functions. Math. Programming 140(1):125–161.CrossrefGoogle Scholar
  • [34] Nesterov Y (2015) Universal gradient methods for convex optimization problems. Math. Programming 152:381–404.CrossrefGoogle Scholar
  • [35] Nesterov Y (2018) Complexity bounds for primal-dual methods minimizing the model of objective function. Math. Programming 171:311–330.CrossrefGoogle Scholar
  • [36] Nesterov Y, Nemirovskii A (1994) Interior-Point Polynomial Algorithms in Convex Programming (SIAM, Philadelphia).CrossrefGoogle Scholar
  • [37] Peña J (2023) Affine invariant convergence rates of the conditional gradient method. SIAM J. Optim. 33(4):2654–2674.CrossrefGoogle Scholar
  • [38] Peña J, Rodríguez D (2019) Polytope conditioning and linear convergence of the Frank–Wolfe algorithm. Math. Oper. Res. 44(1):1–18.AbstractGoogle Scholar
  • [39] Renegar J (2001) A Mathematical View of Interior-Point Methods in Convex Optimization (SIAM, Philadelphia).CrossrefGoogle Scholar
  • [40] Rockafellar RT (1970) Convex Analysis (Princeton University Press, Princeton, NJ).CrossrefGoogle Scholar
  • [41] Rockafellar RT, Wets RJB (1998) Variational Analysis (Springer, Berlin).CrossrefGoogle Scholar
  • [42] Silvey S, Titterington D, Torsney B (1978) An algorithm for optimal designs on a design space. Comm. Statist. Theory Methods 7(14):1379–1389.CrossrefGoogle Scholar
  • [43] Stonyakin F, Tyurin A, Gasnikov A, Dvurechensky P, Agafonov A, Dvinskikh D, Alkousa M, Pasechnyuk D, Artamonov S, Piskunova V (2021) Inexact model: A framework for optimization and variational inequalities. Optim. Methods Software 36(6):1155–1201.CrossrefGoogle Scholar
  • [44] Sun T, Tran-Dinh Q (2019) Generalized self-concordant functions: A recipe for Newton-type methods. Math. Programming. 178:145–213.CrossrefGoogle Scholar
  • [45] Todd MJ (2016) Minimum-Volume Ellipsoids: Theory and Algorithms (SIAM, Philadelphia).CrossrefGoogle Scholar
  • [46] Todd MJ, Yıldırım EA (2007) On Khachiyan’s algorithm for the computation of minimum-volume enclosing ellipsoids. Discrete Appl. Math. 155(13):1731–1744.CrossrefGoogle Scholar
  • [47] Vardi Y, Lee D (1993) From image deblurring to optimal investments: Maximum likelihood solutions for positive linear inverse problems. J. Roy. Statist. Soc. Ser. B Methodological 55(3):569–598.CrossrefGoogle Scholar
  • [48] Wolfe P (1970) Convergence theory in nonlinear programming. Integer and Nonlinear Programming (North-Holland, Amsterdam).Google Scholar
  • [49] Yu HF, Hsieh CJ, Lei Q, Dhillon IS (2017) A greedy approach for budgeted maximum inner product search. Advances in Neural Information Processing Systems, vol. 30 (Curran Associates, Inc., Red Hook, NY).Google Scholar
  • [50] Zhao R (2022) The generalized multiplicative gradient method and its convergence rate analysis. Preprint, submitted July 26, https://arxiv.org/abs/2207.13198.Google Scholar
  • [51] Zhao R, Freund RM (2023) Analysis of the Frank–Wolfe method for convex composite optimization involving a logarithmically-homogeneous barrier. Math. Programming 199(1–2):123–163.CrossrefGoogle Scholar
  • [52] Zhao R, Dalmasso N, Ghassemi M, Potluru VK, Balch T, Veloso M (2022) Fast learning of multidimensional Hawkes processes via Frank–Wolfe. NeurIPS 2022 Workshop Synthetic Data Empowering ML Res.Google Scholar
  • [53] Zhou K, Zha H, Song L (2013) Learning social infectivity in sparse low-rank networks using multi-dimensional Hawkes processes. Proc. AISTATS. (PMLR, New York), 641–649.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.