AdaBB: Adaptive Barzilai-Borwein Method for Convex Optimization

Published Online:https://doi.org/10.1287/moor.2024.0510

References

  • [1] Altschuler JM, Parrilo PA (2023) Acceleration by stepsize hedging I: Multi-step descent and the silver stepsize schedule. Preprint, submitted September 14, https://arxiv.org/abs/2309.07879.Google Scholar
  • [2] Altschuler JM, Parrilo PA (2023) Acceleration by stepsize hedging II: Silver stepsize schedule for smooth convex optimization. Preprint, submitted September 28, https://arxiv.org/abs/2309.16530.Google Scholar
  • [3] Armijo L (1966) Minimization of functions having Lipschitz continuous first partial derivatives. Pacific J. Math. 16:1–3.CrossrefGoogle Scholar
  • [4] Barzilai J, Borwein JM (1988) Two-point step size gradient methods. IMA J. Numer. Anal. 8(1):141–148.CrossrefGoogle Scholar
  • [5] Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.CrossrefGoogle Scholar
  • [6] Burdakov O, Yh D, Huang N (2019) Stabilized Barzilai-Borwein method. J. Comput. Math. 37(6):916–936.CrossrefGoogle Scholar
  • [7] Chang C-C, Lin C-J (2011) LIBSVM: A library for support vector machines. ACM Trans. Intelligent Systems Tech. 2(3):27.CrossrefGoogle Scholar
  • [8] Dai Y-H, Liao L-Z (2002) R-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal. 22(1):1–10.CrossrefGoogle Scholar
  • [9] Das Gupta S, Van Parys BP, Ryu EK (2024) Branch-and-bound performance estimation programming: A unified methodology for constructing optimal optimization methods. Math. Program. 204:567–639.CrossrefGoogle Scholar
  • [10] di Serafino D, Ruggiero V, Toraldo G, Zanni L (2018) On the steplength selection in gradient methods for unconstrained optimization. Appl. Math. Comput. 318:176–195.Google Scholar
  • [11] Drori Y (2018) On the properties of convex functions over open sets. Preprint, submitted December 30, https://arxiv.org/abs/1812.02419.Google Scholar
  • [12] Drori Y, Teboulle M (2014) Performance of first-order methods for smooth convex minimization: A novel approach. Math. Program. 145(1–2):451–482.CrossrefGoogle Scholar
  • [13] Duchi J, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J. Machine Learn. Res. 12:2121–2159.Google Scholar
  • [14] Ghadimi S, Lan G, Zhang H (2019) Generalized uniformly optimal methods for nonlinear programming. J. Sci. Comput. 79(3):1854–1881.CrossrefGoogle Scholar
  • [15] Grimmer B (2023) Provably faster gradient descent via long steps. Preprint, submitted July 12, https://arxiv.org/abs/2307.06324.Google Scholar
  • [16] Grimmer B, Shu K, Wang AL (2023) Accelerated gradient descent via long steps. Preprint, submitted September 27, https://arxiv.org/abs/2309.09961.Google Scholar
  • [17] Grippo L, Lampariello F, Lucidi S (1986) A nonmonotone line search technique for Newton’s method. SIAM J. Numer. Anal. 23(4):707–716.CrossrefGoogle Scholar
  • [18] Lan G, Ouyang Y, Zhang Z (2023) Optimal and parameter-free gradient minimization methods for convex and nonconvex optimization. Preprint, submitted October 18, https://arxiv.org/abs/2310.12139.Google Scholar
  • [19] Latafat P, Themelis A, Patrinos P (2023) On the convergence of adaptive first order methods: Proximal gradient and alternating minimization algorithms. Preprint, submitted November 30, https://arxiv.org/abs/2311.18431.Google Scholar
  • [20] Latafat P, Themelis A, Stella L, Patrinos P (2023) Adaptive proximal algorithms for convex optimization under local Lipschitz continuity of the gradient. Preprint, submitted January 11, https://arxiv.org/abs/2301.04431.Google Scholar
  • [21] Li T, Lan G (2023) A simple uniformly optimal method without line search for convex optimization. Preprint, submitted October 16, https://arxiv.org/abs/2310.10082.Google Scholar
  • [22] Liang J, Monteiro RDC (2021) An average curvature accelerated composite gradient method for nonconvex smooth composite optimization problems. SIAM J. Optim. 31(1):217–243.CrossrefGoogle Scholar
  • [23] Liang J, Monteiro RDC (2023) Average curvature FISTA for nonconvex smooth composite optimization problems. Comput. Optim. Appl. 86(1):275–302.CrossrefGoogle Scholar
  • [24] Liang J, Monteiro RDC, Sim CK (2021) A FISTA-type accelerated gradient algorithm for solving smooth nonconvex composite optimization problems. Comput. Optim. Appl. 79(3):649–679.CrossrefGoogle Scholar
  • [25] Malitsky Y, Mishchenko K (2020) Adaptive gradient descent without descent. Proc. 37th Internat. Conf. Machine Learn. (ACM, New York).Google Scholar
  • [26] Malitsky Y, Mishchenko K (2024) Adaptive proximal gradient method for convex optimization. 38th Conf. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY).Google Scholar
  • [27] McMahan HB, Streeter MJ (2010) Adaptive bound optimization for online convex optimization. Proc. 23rd Annu. Conf. Learn. Theory 2010.Google Scholar
  • [28] Nesterov Y (2004) Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87 (Kluwer Academic Publishers, Boston).CrossrefGoogle Scholar
  • [29] Nesterov Y (2013) Gradient methods for minimizing composite functions. Math. Program. 140(1):125–161.CrossrefGoogle Scholar
  • [30] Nesterov Y, Polyak BT (2006) Cubic regularization of Newton method and its global performance. Math. Program. 108(1):177–205.CrossrefGoogle Scholar
  • [31] Pronzato L, Zhigljavsky A, Bukina E (2013) Estimation of spectral bounds in gradient algorithms. Acta Appl. Math. 127:117–136.CrossrefGoogle Scholar
  • [32] Raydan M (1993) On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal. 13(3):321–326.CrossrefGoogle Scholar
  • [33] Raydan M (1997) The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7(1):26–33.CrossrefGoogle Scholar
  • [34] Tan C, Ma S, Dai YH, Qian Y (2016) Barzilai-Borwein step size for stochastic gradient descent. 30th Conf. Neural Inform. Processing Systems (Curran Associates, Inc., Red Hook, NY).Google Scholar
  • [35] Taylor AB, Hendrickx JM, Glineur F (2017) Exact worst-case performance of first-order methods for composite convex optimization. SIAM J. Optim. 27(3):1283–1313.CrossrefGoogle Scholar
  • [36] Taylor AB, Hendrickx JM, Glineur F (2017) Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1–2):307–345.CrossrefGoogle Scholar
  • [37] Teboulle M, Vaisbourd Y (2023) An elementary approach to tight worst case complexity analysis of gradient based methods. Math. Program. 201(1–2):63–96.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.