A Faster Index Algorithm and a Computational Study for Bandits with Switching Costs

Published Online:https://doi.org/10.1287/ijoc.1070.0238

References

  • Agrawal R., Hegde M. V., Teneketzis D. Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost. IEEE Trans. Automatic Control (1988) 33:899–906CrossrefGoogle Scholar
  • Asawa M., Teneketzis D. Multi-armed bandits with switching penalties. IEEE Trans. Automatic Control (1996) 41:328–348CrossrefGoogle Scholar
  • Banks J. S., Sundaram R. K. Switching costs and the Gittins index. Econometrica (1994) 62:687–694CrossrefGoogle Scholar
  • Dusonchet F., Hongler M.-O. Optimal hysteresis for a class of deterministic deteriorating two-armed bandit problem with switching costs. Automatica (2003) 39:1947–1955CrossrefGoogle Scholar
  • Gittins J. C. Bandit processes and dynamic allocation indices (with discussion). J. Roy. Statist. Soc. Ser. B (1979) 41:148–177Google Scholar
  • Gittins J. C., Jones D. M., Gani J., Sarkadi K., Vincze I. A dynamic allocation index for the sequential design of experiments. Progress in Statistics (European Meeting of Statisticians, Budapest, 1972) (1974) (North-Holland, Amsterdam) 241–266Google Scholar
  • Jun T. A survey on the bandit problem with switching costs. De Economist (2004) 152:513–541CrossrefGoogle Scholar
  • Niño-Mora J. Restless bandits, partial conservation laws and indexability. Adv. Appl. Probab. (2001) 33:76–98CrossrefGoogle Scholar
  • Niño-Mora J. Dynamic allocation indices for restless projects and queueing admission control: A polyhedral approach. Math. Programming (2002) 93:361–413CrossrefGoogle Scholar
  • Niño-Mora J. Restless bandit marginal productivity indices, diminishing returns and optimal control of make-to-order/make-to-stock M/G/1 queues. Math. Oper. Res. (2006) 31:50–84LinkGoogle Scholar
  • Niño-Mora J. A (2/3)n3 fast-pivoting algorithm for the Gittins index and optimal stopping of a Markov chain. INFORMS J. Comput. (2007a) 19:596–606LinkGoogle Scholar
  • Niño-Mora J. Dynamic priority allocation via restless bandit marginal productivity indices (with discussion). Top (2007b) 15:161–198CrossrefGoogle Scholar
  • Niño-Mora J. Computing an index policy for bandits with switching penalties. SMCtools '07: Proc. from the 2007 Workshop on Tools for Solving Structured Markov Chains (2007c) (ICST, Brussels) CrossrefGoogle Scholar
  • Reiman M. I., Wein L. M. Dynamic scheduling of a two-class queue with setups. Oper. Res. (1998) 46:532–547LinkGoogle Scholar
  • Van Oyen M. P., Teneketzis D. Optimal stochastic scheduling of forest networks with switching penalties. Adv. Appl. Probab. (1994) 26:474–497CrossrefGoogle Scholar
  • Varaiya P. P., Walrand J. C., Buyukkoc C. Extensions of the multiarmed bandit problem: The discounted case. IEEE Trans. Automatic Control (1985) 30:426–439CrossrefGoogle Scholar
  • Whittle P. Restless bandits: Activity allocation in a changing world. J. Appl. Probab. (1988) 25:287–298A Celebration of Applied Probability, J. Gani, ed. Applied Probability Trust, Sheffield, UKCrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.