Unbiased Least Squares Regression via Averaged Stochastic Gradient Descent

Nabil Kahalé
Nabil Kahalé
[email protected]
https://orcid.org/0000-0002-4009-6815
ESCP Business School, 75017 Paris, France
Search for more papers by this author

ESCP Business School, 75017 Paris, France

Search for more papers by this author

Published Online:14 Nov 2025https://doi.org/10.1287/moor.2024.0660

References

[1] Agapiou S, Roberts GO, Vollmer SJ (2018) Unbiased Monte Carlo: Posterior estimation for intractable/infinite-dimensional models. Bernoulli 24(3):1726–1786.Crossref, Google Scholar
[2] Asi H, Carmon Y, Jambulapati A, Jin Y, Sidford A (2021) Stochastic bias-reduced gradient methods. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Proc. 35th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 10810–10822.Google Scholar
[3] Asmussen S, Glynn PW (2007) Stochastic Simulation: Algorithms and Analysis, Stochastic Modelling and Applied Probability, vol. 57 (Springer Science & Business Media, New York).Crossref, Google Scholar
[4] Bach F, Moulines E (2013) Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n). Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KO, eds. Proc. 27th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 773–781.Google Scholar
[5] Blanchet JH, Glynn PW, Pei Y (2019) Unbiased multilevel Monte Carlo: Stochastic optimization, steady-state simulation, quantiles, and other applications. Preprint, submitted April 22, https://arxiv.org/abs/1904.09929.Google Scholar
[6] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.Crossref, Google Scholar
[7] Chada NK, Franks J, Jasra A, Law KJ, Vihola M (2021) Unbiased inference for discretely observed hidden Markov model diffusions. SIAM/ASA J. Uncertainty Quantification 9(2):763–787.Crossref, Google Scholar
[8] Cui Z, Fu MC, Peng Y, Zhu L (2020) Optimal unbiased estimation for expected cumulative discounted cost. Eur. J. Oper. Res. 286(2):604–618.Crossref, Google Scholar
[9] Défossez A, Bach F (2015) Averaged least-mean-squares: Bias-variance trade-offs and optimal sampling distributions. Lebanon G, Vishwanathan SVN, eds. Proc. 18th Internat. Conf. Artificial Intelligence Statist., Proceedings of the Machine Learning Research, vol. 38 (PMLR, New York), 205–213.Google Scholar
[10] Dieuleveut A, Bach F (2016) Nonparametric stochastic approximation with large step-sizes. Ann. Statist. 44(4):1363–1399.Crossref, Google Scholar
[11] Dieuleveut A, Flammarion N, Bach F (2017) Harder, better, faster, stronger convergence rates for least-squares regression. J. Machine Learn. Res. 18(101):1–51.Google Scholar
[12] Giles MB (2008) Multilevel Monte Carlo path simulation. Oper. Res. 56(3):607–617.Link, Google Scholar
[13] Glasserman P (2004) Monte Carlo Methods in Financial Engineering (Springer, New York).Crossref, Google Scholar
[14] Glynn PW, Rhee C (2014) Exact estimation for Markov chain equilibrium expectations. J. Appl. Probab. 51(A):377–389.Crossref, Google Scholar
[15] Glynn PW, Whitt W (1992) The asymptotic efficiency of simulation estimators. Oper. Res. 40(3):505–520.Link, Google Scholar
[16] Golub GH, Van Loan CF (2013) Matrix Computations, 4th ed. (JHU Press, Baltimore).Crossref, Google Scholar
[17] Gower RM, Schmidt M, Bach F, Richtarik P (2020) Variance-reduced methods for machine learning. Proc. IEEE 108(11):1968–1983.Crossref, Google Scholar
[18] Han R, Luo L, Lin Y, Huang J (2024) Online inference with debiased stochastic gradient descent. Biometrika 111(1):93–108.Crossref, Google Scholar
[19] Heng J, Houssineau J, Jasra A (2024) On unbiased estimation for partially observed diffusions. J. Machine Learn. Res. 25(66):1–66.Google Scholar
[20] Hu J, Fu MC (2025) On the convergence rate of stochastic approximation for gradient-based stochastic optimization. Oper. Res. 73(2):1143–1150.Link, Google Scholar
[21] Jain P, Kakade SM, Kidambi R, Netrapalli P, Sidford A (2018) Accelerating stochastic gradient descent for least squares regression. Bubeck S, Perchet V, Rigollet P, eds. 31st Annual Conf. Learn. Theory, Proceedings of the Machine Learning Research, vol. 75 (PMLR, New York), 545–604.Google Scholar
[22] Jain P, Kakade SM, Kidambi R, Netrapalli P, Sidford A (2018) Parallelizing stochastic gradient descent for least squares regression: Mini-batching, averaging, and model misspecification. J. Machine Learn. Res. 18(223):1–42.Google Scholar
[23] Kahalé N (2020) General multilevel Monte Carlo methods for pricing discretely monitored Asian options. Eur. J. Oper. Res. 287(2):739–748.Crossref, Google Scholar
[24] Kahalé N (2024) Unbiased time-average estimators for Markov chains. Math. Oper. Res. 49(4):2136–2165.Link, Google Scholar
[25] McLeish D (2011) A general method for debiasing a Monte Carlo estimator. Monte Carlo Methods Appl. 17(4):301–315.Crossref, Google Scholar
[26] Middleton L, Deligiannidis G, Doucet A, Jacob PE (2020) Unbiased Markov chain Monte Carlo for intractable target distributions. Electronic J. Statist. 14(2):2842–2891.Crossref, Google Scholar
[27] Mourtada J (2022) Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices. Ann. Statist. 50(4):2157–2178.Crossref, Google Scholar
[28] Needell D, Srebro N, Ward R (2016) Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Math. Programming 155(1):549–573.Crossref, Google Scholar
[29] Polyak BT, Juditsky AB (1992) Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4):838–855.Crossref, Google Scholar
[30] Rhee C, Glynn PW (2015) Unbiased estimation with square root convergence for SDE models. Oper. Res. 63(5):1026–1043.Link, Google Scholar
[31] Ruppert D (1988) Efficient estimations from a slowly convergent Robbins-Monro process. Technical report, Cornell University Operations Research and Industrial Engineering, Ithaca, NY.Google Scholar
[32] Sportisse A, Boyer C, Dieuleveut A, Josse J (2020) Debiasing averaged stochastic gradient descent to handle missing values. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Proc. 34th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 12957–12967.Google Scholar
[33] Varre AV, Pillaud-Vivien L, Flammarion N (2021) Last iterate convergence of SGD for least-squares in the interpolation regime. Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Wortman Vaughan J, eds. Proc. 35th Conf. Neural Inform. Processing Systems (NeurIPS 2021) (Curran Associates, Red Hook, NY), 21581–21591.Google Scholar
[34] Vihola M (2018) Unbiased estimators and multilevel Monte Carlo. Oper. Res. 66(2):448–462.Link, Google Scholar
[35] Wang T, Wang G (2023) Unbiased multilevel Monte Carlo methods for intractable distributions: MLMC meets MCMC. J. Machine Learn. Res. 24(249):1–40.Google Scholar
[36] Zhou Z, Mertikopoulos P, Bambos N, Glynn P, Ye Y (2022) Distributed stochastic optimization with large delays. Math. Oper. Res. 47(3):2082–2111.Link, Google Scholar

cover image Mathematics of Operations Research

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Received:August 30, 2024
Accepted:October 06, 2025
Published Online:November 14, 2025

Cite as

Nabil Kahalé (2025) Unbiased Least Squares Regression via Averaged Stochastic Gradient Descent. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2024.0660

Keywords

Acknowledgments

The author thanks Francis Bach and Aaron Sidford for valuable discussions, and the Associate Editor as well as the three anonymous referees for their insightful comments and suggestions, which have improved the quality of the paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Unbiased Least Squares Regression via Averaged Stochastic Gradient Descent

References

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News