Robust Markov Decision Processes: Beyond Rectangularity

Vineet Goyal
[email protected]
https://orcid.org/0000-0001-6719-3212
Department of Industrial Engineering and Operations Research, Columbia University, New York, New York 10027
Julien Grand-Clément
[email protected]
https://orcid.org/0000-0002-2864-8779
Department of Information Systems and Operations Management, Ecole des Hautes Études Commerciales (HEC) de Paris, 78350 Jouy-en-Josas, France

Published Online: 20 Apr 2022
https://doi.org/10.1287/moor.2022.1259


Mathematics of Operations Research

Volume 48, Issue 1

February 2023

Pages 1-602, C2

Article Information

Received: July 21, 2020
Accepted: January 03, 2022
Published Online: April 20, 2022

Cite as

Vineet Goyal, Julien Grand-Clément (2022) Robust Markov Decision Processes: Beyond Rectangularity. Mathematics of Operations Research 48(1):203-226.

https://doi.org/10.1287/moor.2022.1259

Acknowledgments

The authors thank an anonymous associate editor and three referees for the detailed comments that helped improve the manuscript.
