Index Policies and Performance Bounds for Dynamic Selection Problems
Published Online: 23 Jan 2020
https://doi.org/10.1287/mnsc.2019.3342
References
- (2008) Relaxations of weakly coupled stochastic dynamic programs. Oper. Res. 56(3):712–727.
- (2015) Dynamic assortment customization with limited inventories. Manufacturing Service Oper. Management 17(4):538–553.
- (2003) Convex Analysis and Optimization (Athena Scientific, Belmont, MA).
- (2007) A learning approach for interactive marketing to a customer segment. Oper. Res. 55(6):1120–1135.
- (2016) Decomposable Markov decision processes: A fluid optimization approach. Oper. Res. 64(6):1537–1555.
- (2000) Restless bandits, linear programming relaxations, and a primal-dual index heuristic. Oper. Res. 48(1):80–90.
- (2014) Information relaxations, duality, and convex stochastic dynamic programs. Oper. Res. 62(6):1394–1415.
- (2010) Information relaxations and duality in stochastic dynamic programs. Oper. Res. 58(4, part 1):785–801.
- (2007) Dynamic assortment with demand learning for seasonal consumer goods. Management Sci. 53(2):276–292.
- (2011) Multi-Armed Bandit Allocation Indices (John Wiley & Sons, Chichester, UK).
- (2003) A Lagrangian decomposition approach to weakly coupled dynamic optimization problems and its applications. PhD thesis, Massachusetts Institute of Technology, Cambridge.
- (2015) On the asymptotic optimality of greedy index heuristics for multi-action restless bandits. Adv. Appl. Probab. 47(3):652–667.
- (2017) An asymptotically optimal index policy for finite-horizon restless bandits. Working paper, Cornell University, Ithaca, NY.
- (2008) Assortment planning: Review of literature and industry practice. Retail Supply Chain Management, International Series in Operations Research & Management Science, vol. 223 (Springer, Boston), 99–153.
- (2007) A generic mean field convergence result for systems of interacting objects. Proc. 4th Internat. Conf. Quantitative Evaluation of Systems (Institute of Electrical and Electronics Engineers, Washington, DC), 3–18.
- (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, Hoboken, NJ).
- (2010) Dynamic assortment optimization with a multinomial logit choice model and capacity constraint. Oper. Res. 58(6):1666–1680.
- (2009) Using Lagrangian relaxation to compute capacity-dependent bid prices in network revenue management. Oper. Res. 57(3):637–649.
- (1990) On an index policy for restless bandits. J. Appl. Probab. 27(3):637–648.
- (1988) Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25(A):287–298.

