Dynamic Programs with Shared Resources and Signals: Dynamic Fluid Policies and Asymptotic Optimality

