Index Policies and Performance Bounds for Dynamic Selection Problems

Published Online: https://doi.org/10.1287/mnsc.2019.3342

References

  • Adelman D, Mersereau AJ (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.
  • Bernstein F, Kök AG, Xie L (2015) Dynamic assortment customization with limited inventories. Manufacturing Service Oper. Management 17(4):538–553.
  • Bertsekas DP, Nedić A, Ozdaglar AE (2003) Convex Analysis and Optimization (Athena Scientific, Belmont, MA).
  • Bertsimas D, Mersereau AJ (2007) A learning approach for interactive marketing to a customer segment. Oper. Res. 55(6):1120–1135.
  • Bertsimas D, Mišić VV (2016) Decomposable Markov decision processes: A fluid optimization approach. Oper. Res. 64(6):1537–1555.
  • Bertsimas D, Niño-Mora J (2000) Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Oper. Res. 48(1):80–90.
  • Brown DB, Smith JE (2014) Information relaxations, duality, and convex stochastic dynamic programs. Oper. Res. 62(6):1394–1415.
  • Brown DB, Smith JE, Sun P (2010) Information relaxations and duality in stochastic dynamic programs. Oper. Res. 58(4):785–801.
  • Caro F, Gallien J (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
  • Gittins J, Glazebrook K, Weber R (2011) Multi-Armed Bandit Allocation Indices (John Wiley & Sons, Chichester, UK).
  • Hawkins JT (2003) A Langrangian decomposition approach to weakly coupled dynamic optimization problems and its applications. PhD thesis, Massachusetts Institute of Technology, Cambridge.
  • Hodge DJ, Glazebrook KD (2015) On the asymptotic optimality of greedy index heuristics for multi-action restless bandits. Adv. Appl. Probab. 47(3):652–667.
  • Hu W, Frazier P (2017) An asymptotically optimal index policy for finite-horizon restless bandits. Working paper, Cornell University, Ithaca, NY.
  • Kök AG, Fisher ML, Vaidyanathan R (2008) Assortment planning: Review of literature and industry practice. Retail Supply Chain Management, International Series in Operations Research & Management Science, vol. 223 (Springer, Boston), 99–153.
  • Le Boudec J-Y, McDonald D, Mundinger J (2007) A generic mean field convergence result for systems of interacting objects. Proc. 4th Internat. Conf. Quantitative Evaluation of Systems (Institute of Electrical and Electronics Engineers, Washington, DC), 3–18.
  • Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, Hoboken, NJ).
  • Rusmevichientong P, Shen Z-JM, Shmoys DB (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
  • Topaloglu H (2009) Using Lagrangian relaxation to compute capacity-dependent bid prices in network revenue management. Oper. Res. 57(3):637–649.
  • Weber RR, Weiss G (1990) On an index policy for restless bandits. J. Appl. Probab. 27(3):637–648.
  • Whittle P (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25(A):287–298.