Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms

Will Ma (Corresponding Author)
[email protected]
http://orcid.org/0000-0002-2420-4468
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Published Online: 5 Feb 2018
https://doi.org/10.1287/moor.2017.0884

Abstract

We study the multi-armed bandit problem in which the arms are Markov chains with rewards. In the finite-horizon setting, the celebrated Gittins indices do not apply, and the exact solution is intractable. We provide approximation algorithms for the general model of Markov decision processes with nonunit transition times. When preemption is not allowed, we provide a (1/2 − ε)-approximation, along with an example showing this is tight. When preemption is allowed, we provide a 1/12-approximation, which improves to a 4/27-approximation when transition times are unity. Our model captures the Markovian Bandits model of Gupta et al., the Stochastic Knapsack model of Dean et al., and the Budgeted Learning model of Guha and Munagala. Our algorithms improve existing results in all three areas. In our analysis, we encounter and overcome what is, to our knowledge, a new obstacle: an algorithm that provably exists via analytical arguments but cannot be found in polynomial time.

Mathematics of Operations Research

Volume 43, Issue 3

August 2018

Pages 693-1050, C2

Article Information

Received: February 28, 2016
Accepted: June 07, 2017
Published Online: February 05, 2018

Cite as

Will Ma (2018) Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms. Mathematics of Operations Research 43(3):789-812.

https://doi.org/10.1287/moor.2017.0884