Open Access

A Linear Response Bandit Problem

Alexander Goldenshluger
Department of Statistics, University of Haifa, Haifa 31905, Israel

Assaf Zeevi
Graduate School of Business, Columbia University, New York, NY 10027, USA

Published Online: 26 Aug 2013
https://doi.org/10.1287/11-SSY032

Abstract

We consider a two-armed bandit problem that involves sequential sampling from two non-homogeneous populations. The response in each population is determined by a random covariate vector and a vector of parameters whose values are not known a priori. The goal is to maximize cumulative expected reward. We study this problem in a minimax setting and develop rate-optimal policies that combine myopic action based on least squares estimates with a suitable “forced sampling” strategy. It is shown that the regret grows logarithmically in the time horizon $n$ and that no policy can achieve a slower growth rate over all feasible problem instances. In this setting of linear response bandits, the identity of the sub-optimal action changes with the values of the covariate vector, and the optimal policy is subject to sampling from the inferior population at a rate that grows like $\sqrt{n}$.
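To make the idea of combining a myopic least squares rule with forced sampling concrete, the following is a minimal sketch and not the authors' policy. It assumes a scalar covariate with an intercept, covariates drawn uniformly on $[-1, 1]$, and Gaussian noise. The forced-sampling schedule (arm 0 at rounds 1, 3, 7, 15, ... and arm 1 at rounds 2, 4, 8, 16, ...) and all function names (`ols_line`, `forced_arm`, `simulate`) are hypothetical choices made for illustration only.

```python
import random


def ols_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Slope is not identifiable (e.g. a single point): use the mean.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def forced_arm(t):
    """Assumed schedule: returns the arm forced at round t (1-indexed), or None."""
    if ((t + 1) & t) == 0:                # t + 1 is a power of two: 1, 3, 7, ...
        return 0
    if t >= 2 and (t & (t - 1)) == 0:     # t is a power of two: 2, 4, 8, ...
        return 1
    return None


def simulate(n, params, noise_sd=0.1, seed=0):
    """Run n rounds; params[i] = (intercept, slope) of arm i.

    Returns (cumulative regret, number of pulls per arm).
    """
    rng = random.Random(seed)
    data = [([], []), ([], [])]           # per-arm (covariates, responses)
    pulls = [0, 0]
    regret = 0.0
    for t in range(1, n + 1):
        x = rng.uniform(-1.0, 1.0)        # assumed covariate distribution
        arm = forced_arm(t)
        if arm is None:
            # Myopic step: pick the arm with the larger predicted response at x.
            preds = []
            for xs, ys in data:
                a, b = ols_line(xs, ys)
                preds.append(a + b * x)
            arm = 0 if preds[0] >= preds[1] else 1
        means = [a + b * x for a, b in params]
        y = means[arm] + rng.gauss(0.0, noise_sd)
        data[arm][0].append(x)
        data[arm][1].append(y)
        pulls[arm] += 1
        # Regret is measured against the arm that is best for this covariate.
        regret += max(means) - means[arm]
    return regret, pulls


if __name__ == "__main__":
    # Arm 0 is better for x < 0 and arm 1 for x > 0, so the identity of the
    # sub-optimal arm changes with the covariate.
    total_regret, counts = simulate(500, params=[(0.0, -1.0), (0.0, 1.0)])
    print(total_regret, counts)
```

In this sketch the forced rounds guarantee that both arms keep receiving samples regardless of how the myopic rule behaves, which is the role forced sampling plays in the abstract's description. The schedule and the use of all collected samples in the least squares fit are simplifications.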

Volume 3, Issue 1

June 2013

Pages 1-321

Article Information

Received: August 01, 2011
Published Online: August 26, 2013

Cite as

Alexander Goldenshluger, Assaf Zeevi (2013) A Linear Response Bandit Problem. Stochastic Systems 3(1):230-261.

https://doi.org/10.1287/11-SSY032
