First- and Second-Order Stochastic Adaptive Regularization with Cubics: High-Probability Iteration and Sample Complexity

Published Online:https://doi.org/10.1287/ijoo.2025.0123

References

  • Bandeira AS, Scheinberg K, Vicente LN (2014) Convergence of trust-region methods based on probabilistic models. SIAM J. Optim. 24(3):1238–1264.Google Scholar
  • Bellavia S, Gurioli G (2022) Stochastic analysis of an adaptive cubic regularization method under inexact gradient evaluations and dynamic Hessian accuracy. Optimization 71(1):227–261.Google Scholar
  • Bellavia S, Gurioli G, Morini B, Toint PL (2019) Adaptive regularization algorithms with inexact evaluations for nonconvex optimization. SIAM J. Optim. 29(4):2881–2915.Google Scholar
  • Bellavia S, Gurioli G, Morini B, Toint PL (2020) A stochastic cubic regularisation method with inexact function evaluations and random derivatives for finite sum minimisation. 37th Internat. Conf. Machine Learn. (ICML2020).Google Scholar
  • Bellavia S, Gurioli G, Morini B, Toint PL (2022) Adaptive regularization for nonconvex optimization using inexact function values and randomly perturbed derivatives. J. Complexity 68:101591.Google Scholar
  • Berahas AS, Xie M, Zhou B (2025) A sequential quadratic programming method with high probability complexity bounds for nonlinear equality constrained stochastic optimization. SIAM J. Optim. 35(1):240–269.Google Scholar
  • Blanchet J, Cartis C, Menickelly M, Scheinberg K (2019) Convergence rate analysis of a stochastic trust-region method via supermartingales. INFORMS J. Optim. 1(2):92–119.LinkGoogle Scholar
  • Cao L, Berahas AS, Scheinberg K (2024) First-and second-order high probability complexity bounds for trust-region methods with noisy oracles. Math. Programming 207(1):55–106.Google Scholar
  • Carmon Y, Duchi J (2019) Gradient descent finds the cubic-regularized nonconvex Newton step. SIAM J. Optim. 29(3):2146–2178.Google Scholar
  • Cartis C, Scheinberg K (2018) Global convergence rate analysis of unconstrained optimization methods based on probabilistic models. Math. Programming 169(2):337–375.Google Scholar
  • Cartis C, Gould N, Toint PL (2011a) Optimal Newton-type methods for nonconvex smooth optimization problems. Preprint, submitted June 22, https://optimization-online.org/2011/06/3070/.Google Scholar
  • Cartis C, Gould NIM, Toint PL (2011b) Adaptive cubic regularisation methods for unconstrained optimization. Part I: Motivation, convergence and numerical results. Math. Programming 127(2):245–295.Google Scholar
  • Cartis C, Gould NIM, Toint PL (2011c) Adaptive cubic regularisation methods for unconstrained optimization. Part II: Worst-case function- and derivative-evaluation complexity. Math. Programming 130(2):295–319.Google Scholar
  • Cartis C, Shao Z, Tansley E (2025) Random subspace cubic-regularization methods, with applications to low-rank functions. Preprint, submitted January 16, https://arxiv.org/abs/2501.09734.Google Scholar
  • Chen R, Menickelly M, Scheinberg K (2018) Stochastic optimization using a trust-region method and random models. Math. Programming 169(2):447–487.Google Scholar
  • Gratton S, Royer CW, Vicente LN, Zhang Z (2018) Complexity and global rates of trust-region methods based on probabilistic models. IMA J. Numer. Anal. 38(3):1579–1597.Google Scholar
  • Hanzely F, Doikov N, Nesterov Y, Richtarik P (2020) Stochastic subspace cubic Newton method. Internat. Conf. Machine Learn. (PMLR), 4027–4038.Google Scholar
  • Jin B, Scheinberg K, Xie M (2024) High probability complexity bounds for adaptive step search based on stochastic oracles. SIAM J. Optim. 34(3):2411–2439.Google Scholar
  • Jin B, Scheinberg K, Xie M (2025) Sample complexity analysis for adaptive optimization algorithms with stochastic oracles. Math. Programming 209(1):651–679.Google Scholar
  • Kim S, Pasupathy R, Henderson SG (2014) A guide to sample average approximation. Fu M, ed. Handbook of Simulation Optimization, International Series in Operations Research & Management Science, vol. 216 (Springer, New York), 207–243.Google Scholar
  • Kohler JM, Lucchi A (2017) Sub-sampled cubic regularization for non-convex optimization. Proc. 34th Internat. Conf. Machine Learn. (ICML 2017) (PMLR).Google Scholar
  • Liu L, Liu X, Hsieh CJ, Tao D (2018) Stochastic second-order methods for non-convex optimization with inexact Hessian and gradient. Preprint, submitted September 26, https://arxiv.org/abs/1809.09853.Google Scholar
  • Menickelly M, Wild SM, Xie M (2026) A stochastic quasi-newton method in the absence of common random numbers. J. Optim. Theory Appl. 208(2):84.Google Scholar
  • Moré JJ, Wild SM (2011) Estimating computational noise. SIAM J. Sci. Comput. 33(3):1292–1314.Google Scholar
  • Nesterov Y, Polyak BT (2006) Cubic regularization of newton method and its global performance. Math. Programming 108(1):177–205.Google Scholar
  • Paquette C, Scheinberg K (2020) A stochastic line search method with expected complexity analysis. SIAM J. Optim. 30(1):349–376.Google Scholar
  • Park S, Jung SH, Pardalos PM (2020) Combining stochastic adaptive cubic regularization with negative curvature for nonconvex optimization. J. Optim. Theory Appl. 184(3):953–971.Google Scholar
  • Scheinberg K, Xie M (2023) Stochastic adaptive regularization method with cubics: A high probability complexity bound. 2023 Winter Simulation Conf. (WSC) (IEEE, Piscataway, NJ), 3520–3531.Google Scholar
  • Tripuraneni N, Stern M, Jin C, Regier J, Jordan MI (2018) Stochastic cubic regularization for fast nonconvex optimization. Bengio S, Wallach HM, Larochelle H, Grauman K, Cesa-Bianchi N, eds. NIPS’18: Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 2899–2908.Google Scholar
  • Vershynin R (2018) High-Dimensional Probability: An Introduction with Applications in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 47 (Cambridge University Press, Cambridge, UK).Google Scholar
  • Wang Z, Zhou Y, Liang Y, Lan G (2019a) A note on inexact gradient and Hessian conditions for cubic regularized Newton’s method. Oper. Res. Lett. 47(2):146–149.Google Scholar
  • Wang Z, Zhou Y, Liang Y, Lan G (2019b) Stochastic variance-reduced cubic regularization for nonconvex optimization. 22nd Intern. Conf. Artificial Intelligence Statist. (PMLR, New York), 2731–2740.Google Scholar
  • Zhang J, Xiao L, Zhang S (2022) Adaptive stochastic variance reduction for subsampled Newton method with cubic regularization. INFORMS J. Optim. 4(1):45–64.LinkGoogle Scholar
  • Zhao J, Doikov N, Lucchi A (2025) Cubic regularized subspace Newton for non-convex optimization. 28th Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 811–819.Google Scholar
  • Zhou D, Gu Q (2020) Stochastic recursive variance-reduced cubic regularization methods. Proc. 23rd Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 3980–3990.Google Scholar
  • Zhou D, Xu P, Gu Q (2018) Stochastic variance-reduced cubic regularized Newton methods. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 80 (PMLR, New York), 5990–5999.Google Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.