Estimation Errors as Regret Lower Bounds for Linear Contextual Bandits

Jiahao He
Jiahao He
[email protected]
https://orcid.org/0000-0002-1825-9649
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong Special Administrative Region, China
Search for more papers by this author
,
Jiheng Zhang
Jiheng Zhang
[email protected]
https://orcid.org/0000-0003-3025-1495
Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong Special Administrative Region, China
Search for more papers by this author
,
Rachel Q. Zhang
Corresponding Author
Rachel Q. Zhang
[email protected]
https://orcid.org/0000-0002-0789-8488
Tsingshan Institute for Advanced Business Studies and School of Management, Zhejiang University, Hangzhou 310058, China
Search for more papers by this author

Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong Special Administrative Region, China

Search for more papers by this author

Jiheng Zhang

[email protected]

https://orcid.org/0000-0003-3025-1495

Department of Industrial Engineering and Decision Analytics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong Special Administrative Region, China

Search for more papers by this author

Rachel Q. Zhang

Corresponding Author

Rachel Q. Zhang

[email protected]

https://orcid.org/0000-0002-0789-8488

Tsingshan Institute for Advanced Business Studies and School of Management, Zhejiang University, Hangzhou 310058, China

Search for more papers by this author

Published Online:2 Jun 2026https://doi.org/10.1287/mnsc.2023.02827

References

Acharya J, Sun Z, Zhang H (2021) Differentially private Assouad, Fano, and Le Cam. Proc. 32nd Internat. Conf. Algorithmic Learn. Theory (PMLR, New York), 48–78.Google Scholar
Agrawal S, Goyal N (2013) Thompson sampling for contextual bandits with linear payoffs. Proc. 30th Internat. Conf. Machine Learn. (PMLR, New York), 127–135.Google Scholar
Auer P (2002) Using confidence bounds for exploitation-exploration trade-offs. J. Machine Learn. Res. 3:397–422.Google Scholar
Bastani H, Bayati M (2020) Online decision making with high-dimensional covariates. Oper. Res. 68(1):276–294.Link, Google Scholar
Bastani H, Bayati M, Khosravi K (2021) Mostly exploration-free algorithms for contextual bandits. Management Sci. 67(3):1329–1349.Link, Google Scholar
Bryc W (1995) Rotation invariant distributions. The Normal Distribution: Characterizations with Applications, Lecture Notes in Statistics, vol. 100 (Springer, New York), 51–69.Crossref, Google Scholar
Cai TT, Wang Y, Zhang L (2020) The cost of privacy in generalized linear models: Algorithms and minimax lower bounds. Preprint, submitted December 6, https://arxiv.org/abs/2011.03900.Google Scholar
Cai TT, Wang Y, Zhang L (2021) The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy. Ann. Statist. 49(5):2825–2850.Crossref, Google Scholar
Chen X, Simchi-Levi D, Wang Y (2022) Privacy-preserving dynamic personalized pricing with demand learning. Management Sci. 68(7):4878–4898.Link, Google Scholar
Chen Y, Wang Y, Fang EX, Wang Z, Li R (2024) Nearly dimension-independent sparse linear bandit over small action spaces via best subset selection. J. Amer. Statist. Assoc. 119(545):246–258.Crossref, Google Scholar
Chu W, Li L, Reyzin L, Schapire R (2011) Contextual bandits with linear payoff functions. Proc. Fourteenth Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 208–214.Google Scholar
Duchi J (2016) Lecture notes for Statistics 311/Electrical Engineering 377. Accessed April 18, 2026, https://stanford.edu/class/stats311/OldSyllabi/full_notes.pdf.Google Scholar
Duchi JC, Wainwright MJ (2013) Distance-based and continuum Fano inequalities with applications to statistical estimation. Preprint, submitted December 31, https://arxiv.org/abs/1311.2669.Google Scholar
Duchi JC, Jordan MI, Wainwright MJ (2018) Minimax optimal procedures for locally private estimation. J. Amer. Statist. Assoc. 113(521):182–201.Crossref, Google Scholar
Goldenshluger A, Zeevi A (2013) A linear response bandit problem. Stochastic Systems 3(1):230–261.Link, Google Scholar
Han Y, Liang Z, Wang Y, Zhang J (2021) Generalized linear bandits with local differential privacy. Adv. Neural Inform. Processing Systems 34:26511–26522.Google Scholar
Han Y, Zhou Z, Zhou Z, Blanchet J, Glynn PW, Ye Y (2020) Sequential batch learning in finite-action linear contextual bandits. Preprint, submitted April 14, https://arxiv.org/abs/2004.06321.Google Scholar
He J, Zhang J, Zhang RQ (2022) A reduction from linear contextual bandits lower bounds to estimations lower bounds. Proc. 39th Internat. Conf. Machine Learn. (PMLR, New York), 8660–8677.Google Scholar
Kannan S, Morgenstern JH, Roth A, Waggoner B, Wu ZS (2018) A smoothed analysis of the greedy algorithm for the linear contextual bandit problem. Proc. 32nd Internat. Conf. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 2231–2241. Google Scholar
Kim W, Lee K, Paik MC (2023) Double doubly robust Thompson sampling for generalized linear contextual bandits. Proc. 37th AAAI Conf. Artificial Intelligence and 35th Conf. Innovative Appl. Artificial Intelligence and 13th Sympos. Educational Adv. Artificial Intelligence (AAAI Press, Palo Alto, CA), 8300–8307. Google Scholar
Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Li W, Barik A, Honorio J (2022) A simple unified framework for high dimensional bandit problems. Proc. 39th Internat. Conf. Machine Learn. (PMLR, New York), 12619–12655.Google Scholar
Li Y, Wang Y, Zhou Y (2019) Nearly minimax-optimal regret for linearly parameterized bandits. Proc. 32nd Conf. Learn. Theory (PMLR, New York), 2173–2174.Google Scholar
Li L, Chu W, Langford J, Schapire RE (2010) A contextual-bandit approach to personalized news article recommendation. WWW’10 Proc. 19th Internat. Conf. World Wide Web (Association for Computing Machinery, New York), 661–670.Google Scholar
Oh MH, Iyengar G (2021) Multinomial logit contextual bandits: Provable optimality and practicality. Proc. AAAI Conf. Artificial Intelligence 35(10):9205–9213.Google Scholar
Oh MH, Iyengar G, Zeevi A (2021) Sparsity-agnostic lasso bandit. Proc. 38th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 139 (PMLR, New York), 8271–8280.Google Scholar
Ren Z, Zhou Z (2024) Dynamic batch learning in high-dimensional sparse linear contextual bandits. Management Sci. 70(2):1315–1342.Link, Google Scholar
Shariff R, Sheffet O (2018) Differentially private contextual linear bandits. Adv. Neural Inform. Processing Systems 31:4296–4306. Google Scholar
Wang D, Xu J (2019) On sparse linear regression in the local differential privacy model. Proc. 36th Internat. Conf. Machine Learn., Proceedings of Machine Learning Research, vol. 97 (PMLR, New York), 6628–6637.Google Scholar
Zheng K, Cai T, Huang W, Li Z, Wang L (2020) Locally differentially private (contextual) bandits learning. Adv. Neural Inform. Processing Systems 33:12300–12310.Google Scholar

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Received:September 05, 2023
Accepted:March 08, 2026
Published Online:June 02, 2026

Cite as

Jiahao He, Jiheng Zhang, Rachel Q. Zhang (2026) Estimation Errors as Regret Lower Bounds for Linear Contextual Bandits. Management Science 0(0).

https://doi.org/10.1287/mnsc.2023.02827

Keywords

Acknowledgments

A preliminary version of this paper (He et al. 2022) appeared in the 39th International Conference on Machine Learning, and the current paper is a significantly enhanced version of it.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Estimation Errors as Regret Lower Bounds for Linear Contextual Bandits

References

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News