Robust Markov Decision Processes: Beyond Rectangularity

Published Online: https://doi.org/10.1287/moor.2022.1259

References

  • [1] Akian M, Gaubert S, Grand-Clément J, Guillaud J (2019) The operator approach to entropy games. Theory Comput. Systems 63(5):1089–1130.
  • [2] Alagoz O, Hsu H, Schaefer AJ, Roberts MS (2010) Markov decision processes: A tool for sequential decision making under uncertainty. Medical Decision Making 30(4):474–483.
  • [3] Ben-Tal A, Nemirovski A (2000) Robust solutions of linear programming problems contaminated with uncertain data. Math. Programming 88(3):411–424.
  • [4] Ben-Tal A, Nemirovski A (2001) Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2 (SIAM, Philadelphia, PA).
  • [5] Bertsekas D (2011) Dynamic Programming and Optimal Control. 3rd ed., vol. 2 (Athena Scientific, Belmont, MA).
  • [6] Bertsimas D, Sim M (2004) The price of robustness. Oper. Res. 52(1):35–53.
  • [7] Bertsimas D, Thiele A (2006) Robust and Data-Driven Optimization: Modern Decision Making Under Uncertainty. Models, Methods, and Applications for Innovative Decision Making. INFORMS TutORials in Operations Research, 95–122.
  • [8] Cheng G, Xie J, Zheng Z (2019) Optimal stopping for medical treatment with predictive information. Preprint, submitted June 13, https://dx.doi.org/10.2139/ssrn.3397530.
  • [9] Delage E, Mannor S (2010) Percentile optimization for Markov decision processes with parameter uncertainty. Oper. Res. 58(1):203–213.
  • [10] Duan Y, Ke T, Wang M (2019) State aggregation learning from Markov transition data. Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 32 (Curran Associates, Inc., Red Hook, New York). https://proceedings.neurips.cc/paper/2019/file/070dbb6024b5ef93784428afc71f2146-Paper.pdf.
  • [11] Epstein L, Schneider M (2003) Recursive multiple-priors. J. Econom. Theory 113(1):1–31.
  • [12] Feinberg E, Shwartz A (2012) Handbook of Markov Decision Processes: Methods and Applications, vol. 40 (Springer Science & Business Media, Boston).
  • [13] Gaubert S, Gunawardena J (1998) A non-linear hierarchy for discrete event dynamical systems. Proc. 4th Workshop Discrete Event Systems, Cagliari, Italy, vol. 98.
  • [14] Givan R, Leach S, Dean T (2000) Bounded-parameter Markov decision processes. Artificial Intelligence 122(1–2):71–109.
  • [15] Goh J, Bayati M, Zenios SA, Singh S, Moore D (2018) Data uncertainty in Markov chains: Application to cost-effectiveness analyses of medical innovations. Oper. Res. 66(3):697–715.
  • [16] Grand-Clément J, Chan CW, Goyal V, Escobar G (2020) Robust policies for proactive ICU transfers. Preprint, submitted February 14, https://arxiv.org/abs/2002.06247.
  • [17] Ho CP, Petrik M, Wiesemann W (2018) Fast Bellman updates for robust MDPs. Dy J, Krause A, eds. Proc. 35th Internat. Conf. Machine Learn. Proceedings of Machine Learning Research Series, July 10–15, vol. 80 (PMLR), 1979–1988. http://proceedings.mlr.press/v80/ho18a/ho18a.pdf.
  • [18] Ho CP, Petrik M, Wiesemann W (2021) Partial policy iteration for l1-robust Markov decision processes. J. Machine Learn. Res. 22(275):1–46.
  • [19] Hordijk A, Dekker R, Kallenberg LCM (1985) Sensitivity-analysis in discounted Markovian decision problems. Oper. Res. Spektrum 7(3):143–151.
  • [20] Iyengar G (2005) Robust dynamic programming. Math. Oper. Res. 30(2):257–280.
  • [21] Mannor S, Mebel O, Xu H (2016) Robust MDPs with k-rectangular uncertainty. Math. Oper. Res. 41(4):1484–1509.
  • [22] Nilim A, El Ghaoui L (2005) Robust control of Markov decision processes with uncertain transition probabilities. Oper. Res. 53(5):780–798.
  • [23] Puterman ML (1994) Markov Decision Processes: Discrete Stochastic Dynamic Programming (John Wiley & Sons, New York).
  • [24] Ross IM (2009) A Primer on Pontryagin’s Principle in Optimal Control (Collegiate Publications, Carmel, CA).
  • [25] Satia JK, Lave JK (1973) Markov decision processes with uncertain transition probabilities. Oper. Res. 21(3):728–740.
  • [26] Schaefer AJ, Bailey MD, Shechter SM, Roberts MS (2005) Modeling medical treatment using Markov decision processes. Brandeau ML, Sainfort F, Pierskalla WP, eds. Operations Research and Health Care, International Series in Operations Research & Management Science, vol. 70 (Springer, Boston), 593–612.
  • [27] Steimle LN, Denton BT (2017) Markov decision processes for screening and treatment of chronic diseases. Boucherie R, van Dijk N, eds. Markov Decision Processes in Practice, International Series in Operations Research & Management Science, vol. 248 (Springer, Cham, Switzerland), 189–222.
  • [28] Wiesemann W, Kuhn D, Rustem B (2013) Robust Markov decision processes. Math. Oper. Res. 38(1):153–183.
  • [29] Xu Y, Yin Y (2013) A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM J. Imaging Sci. 6(3):1758–1789.