Unbiased Least Squares Regression via Averaged Stochastic Gradient Descent

Published Online:https://doi.org/10.1287/moor.2024.0660

References

  • [1] Agapiou S, Roberts GO, Vollmer SJ (2018) Unbiased Monte Carlo: Posterior estimation for intractable/infinite-dimensional models. Bernoulli 24(3):1726–1786.CrossrefGoogle Scholar
  • [2] Asi H, Carmon Y, Jambulapati A, Jin Y, Sidford A (2021) Stochastic bias-reduced gradient methods. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 10810–10822.Google Scholar
  • [3] Asmussen S, Glynn PW (2007) Stochastic Simulation: Algorithms and Analysis, Stochastic Modelling and Applied Probability, vol. 57 (Springer Science & Business Media, New York).CrossrefGoogle Scholar
  • [4] Bach F, Moulines E (2013) Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n). Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KO, eds. Proc. 27th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 773–781.Google Scholar
  • [5] Blanchet JH, Glynn PW, Pei Y (2019) Unbiased multilevel Monte Carlo: Stochastic optimization, steady-state simulation, quantiles, and other applications. Preprint, submitted April 22, https://arxiv.org/abs/1904.09929.Google Scholar
  • [6] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.CrossrefGoogle Scholar
  • [7] Chada NK, Franks J, Jasra A, Law KJ, Vihola M (2021) Unbiased inference for discretely observed hidden Markov model diffusions. SIAM/ASA J. Uncertainty Quantification 9(2):763–787.CrossrefGoogle Scholar
  • [8] Cui Z, Fu MC, Peng Y, Zhu L (2020) Optimal unbiased estimation for expected cumulative discounted cost. Eur. J. Oper. Res. 286(2):604–618.CrossrefGoogle Scholar
  • [9] Défossez A, Bach F (2015) Averaged least-mean-squares: Bias-variance trade-offs and optimal sampling distributions. Lebanon G, Vishwanathan SVN, eds. Proc. 18th Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 38 (PMLR, New York), 205–213.Google Scholar
  • [10] Dieuleveut A, Bach F (2016) Nonparametric stochastic approximation with large step-sizes. Ann. Statist. 44(4):1363–1399.CrossrefGoogle Scholar
  • [11] Dieuleveut A, Flammarion N, Bach F (2017) Harder, better, faster, stronger convergence rates for least-squares regression. J. Machine Learn. Res. 18(101):1–51.Google Scholar
  • [12] Giles MB (2008) Multilevel Monte Carlo path simulation. Oper. Res. 56(3):607–617.LinkGoogle Scholar
  • [13] Glasserman P (2004) Monte Carlo Methods in Financial Engineering (Springer, New York).CrossrefGoogle Scholar
  • [14] Glynn PW, Rhee C (2014) Exact estimation for Markov chain equilibrium expectations. J. Appl. Probab. 51(A):377–389.CrossrefGoogle Scholar
  • [15] Glynn PW, Whitt W (1992) The asymptotic efficiency of simulation estimators. Oper. Res. 40(3):505–520.LinkGoogle Scholar
  • [16] Golub GH, Van Loan CF (2013) Matrix Computations, 4th ed. (JHU Press, Baltimore).CrossrefGoogle Scholar
  • [17] Gower RM, Schmidt M, Bach F, Richtarik P (2020) Variance-reduced methods for machine learning. Proc. IEEE 108(11):1968–1983.CrossrefGoogle Scholar
  • [18] Han R, Luo L, Lin Y, Huang J (2024) Online inference with debiased stochastic gradient descent. Biometrika 111(1):93–108.CrossrefGoogle Scholar
  • [19] Heng J, Houssineau J, Jasra A (2024) On unbiased estimation for partially observed diffusions. J. Machine Learn. Res. 25(66):1–66.Google Scholar
  • [20] Hu J, Fu MC (2025) On the convergence rate of stochastic approximation for gradient-based stochastic optimization. Oper. Res. 73(2):1143–1150.LinkGoogle Scholar
  • [21] Jain P, Kakade SM, Kidambi R, Netrapalli P, Sidford A (2018) Accelerating stochastic gradient descent for least squares regression. Bubeck S, Perchet V, Rigollet P, eds. 31st Annual Conf. Learn. Theory, Proceedings of the Machine Learning Research, vol. 75 (PMLR, New York), 545–604.Google Scholar
  • [22] Jain P, Kakade SM, Kidambi R, Netrapalli P, Sidford A (2018) Parallelizing stochastic gradient descent for least squares regression: Mini-batching, averaging, and model misspecification. J. Machine Learn. Res. 18(223):1–42.Google Scholar
  • [23] Kahalé N (2020) General multilevel Monte Carlo methods for pricing discretely monitored Asian options. Eur. J. Oper. Res. 287(2):739–748.CrossrefGoogle Scholar
  • [24] Kahalé N (2024) Unbiased time-average estimators for Markov chains. Math. Oper. Res. 49(4):2136–2165.LinkGoogle Scholar
  • [25] McLeish D (2011) A general method for debiasing a Monte Carlo estimator. Monte Carlo Methods Appl. 17(4):301–315.CrossrefGoogle Scholar
  • [26] Middleton L, Deligiannidis G, Doucet A, Jacob PE (2020) Unbiased Markov chain Monte Carlo for intractable target distributions. Electronic J. Statist. 14(2):2842–2891.CrossrefGoogle Scholar
  • [27] Mourtada J (2022) Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices. Ann. Statist. 50(4):2157–2178.CrossrefGoogle Scholar
  • [28] Needell D, Srebro N, Ward R (2016) Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Math. Programming 155(1):549–573.CrossrefGoogle Scholar
  • [29] Polyak BT, Juditsky AB (1992) Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4):838–855.CrossrefGoogle Scholar
  • [30] Rhee C, Glynn PW (2015) Unbiased estimation with square root convergence for SDE models. Oper. Res. 63(5):1026–1043.LinkGoogle Scholar
  • [31] Ruppert D (1988) Efficient estimations from a slowly convergent Robbins-Monro process. Technical report, Cornell University Operations Research and Industrial Engineering, Ithaca, NY.Google Scholar
  • [32] Sportisse A, Boyer C, Dieuleveut A, Josse J (2020) Debiasing averaged stochastic gradient descent to handle missing values. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 12957–12967.Google Scholar
  • [33] Varre AV, Pillaud-Vivien L, Flammarion N (2021) Last iterate convergence of SGD for least-squares in the interpolation regime. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Proc. 35th Conf. Neural Inform. Processing Systems (NeurIPS 2021) (Curran Associates, Red Hook, NY), 21581–21591.Google Scholar
  • [34] Vihola M (2018) Unbiased estimators and multilevel Monte Carlo. Oper. Res. 66(2):448–462.LinkGoogle Scholar
  • [35] Wang T, Wang G (2023) Unbiased multilevel Monte Carlo methods for intractable distributions: MLMC meets MCMC. J. Machine Learn. Res. 24(249):1–40.Google Scholar
  • [36] Zhou Z, Mertikopoulos P, Bambos N, Glynn P, Ye Y (2022) Distributed stochastic optimization with large delays. Math. Oper. Res. 47(3):2082–2111.LinkGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.