Technical Note—On the Convergence Rate of Stochastic Approximation for Gradient-Based Stochastic Optimization

