Estimation Errors as Regret Lower Bounds for Linear Contextual Bandits
Abstract
Linear contextual bandits and their variants represent a fundamental class of models with wide real-world applications, usually solved using algorithms guided by parameter estimation. The Cauchy-Schwarz inequality established analytically that estimation errors dominate algorithm regrets. Therefore, accurate parameter estimation suffices to guarantee algorithms with low regrets. In this paper, we establish the necessity of accurate estimations in effective algorithms for linear contextual bandit problems by first constructing an estimator for any given algorithm. We then show that algorithm regrets dominate the estimation errors of their induced estimators under mild conditions. In other words, low-regret algorithms must imply accurate estimators, and developing low-regret algorithms is equivalent to finding efficient estimators, either implicitly or explicitly. Thus, our analysis reduces regret lower bounds to estimation errors, bridging lower bound analysis in bandit problems and regression analysis. This provides a framework for finding practical and informative regret lower bounds by leveraging the extensive estimation literature in Statistics. It leads to insightful lower bounds for a variety of contextual bandit problems in the literature, which are either new or tighter than existing ones.
This paper was accepted by J. George Shanthikumar, data science.
Funding: Financial support from the Hong Kong Research Grants Council [Grants 16200821, 16500023, 16500225, and T32-615/24-R] is gratefully acknowledged.
Supplemental Material: The online appendix is available at https://doi.org/10.1287/mnsc.2023.02827.

