Distributionally Robust Markov Decision Processes and Their Connection to Risk Measures

Nicole Bäuerle
https://orcid.org/0000-0003-0077-3444
Department of Mathematics, Karlsruhe Institute of Technology, 76128 Karlsruhe, Germany

Alexander Glauner
https://orcid.org/0000-0002-7823-6035
Department of Mathematics, Karlsruhe Institute of Technology, 76128 Karlsruhe, Germany
Published Online: 23 Nov 2021
https://doi.org/10.1287/moor.2021.1187

References

[1] Ahmadi-Javid A (2012) Entropic value-at-risk: A new coherent risk measure. J. Optim. Theory Appl. 155(3):1105–1123.
[2] Aliprantis CD, Border KC (2006) Infinite Dimensional Analysis: A Hitchhiker’s Guide, 3rd ed. (Springer-Verlag, Berlin).
[3] Anderson CL, Burke N, Davison M (2015) Optimal management of wind energy with storage: Structural implications for policy and market design. J. Energy Engrg. 141(1):B4014002.
[4] Barbu V, Precupanu T (2012) Convexity and Optimization in Banach Spaces, 4th ed. (Springer Netherlands, Dordrecht, Netherlands).
[5] Bäuerle N, Jaśkiewicz A (2017) Optimal dividend payout model with risk sensitive preferences. Insurance Math. Econom. 73:82–93.
[6] Bäuerle N, Jaśkiewicz A (2018) Stochastic optimal growth model with risk sensitive preferences. J. Econom. Theory 173:181–200.
[7] Bäuerle N, Ott J (2011) Markov decision processes with average-value-at-risk criteria. Math. Methods Oper. Res. 74(3):361–379.
[8] Bäuerle N, Rieder U (2011) Markov Decision Processes with Applications to Finance (Springer-Verlag, Berlin).
[9] Bäuerle N, Rieder U (2020) Markov Decision Processes Under Ambiguity, vol. 122 (Banach Center Publications), 25–39.
[10] Bielecki TR, Chen T, Cialenco I, Cousin A, Jeanblanc M (2019) Adaptive robust control under model uncertainty. SIAM J. Control Optim. 57(2):925–946.
[11] Chow Y, Tamar A, Mannor S, Pavone M (2015) Risk-sensitive and robust decision-making: A CVaR optimization approach. Adv. Neural Inform. Processing Systems 28:1522–1530.
[12] Cichocki A, Amari S-i (2010) Families of alpha- beta- and gamma- divergences: Flexible and robust measures of similarities. Entropy 12(6):1532–1568.
[13] Ellsberg D (1961) Risk, ambiguity, and the Savage axioms. Quart. J. Econom. 75(4):643–669.
[14] Epstein LG, Schneider M (2003) Recursive multiple-priors. J. Econom. Theory 113(1):1–31.
[15] Gil M, Alajaji F, Linder T (2013) Rényi divergence measures for commonly used univariate continuous distributions. Inform. Sci. 249:124–131.
[16] Gilboa I, Schmeidler D (1989) Maxmin expected utility with a non-unique prior. J. Math. Econom. 18(2):141–153.
[17] Glauner A (2020) Robust and risk-sensitive Markov decision processes with applications to dynamic optimal reinsurance. PhD thesis, Karlsruhe Institute of Technology, Karlsruhe, Germany.
[18] González-Trejo JI, Hernández-Lerma O, Hoyos-Reyes LF (2002) Minimax control of discrete-time stochastic systems. SIAM J. Control Optim. 41(5):1626–1659.
[19] Hansen LP, Sargent TJ (1999) Five games and two objective functions that promote robustness. Manuscript, University of Chicago, Stanford University, and Hoover Institution, Chicago. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.35.2668&rep=rep1&type=pdf.
[20] Hernández-Lerma O (1989) Adaptive Markov Control Processes, Applied Mathematical Sciences, vol. 79 (Springer-Verlag, New York).
[21] Hernández-Lerma O, Lasserre JB (1996) Discrete-Time Markov Control Processes: Basic Optimality Criteria (Springer-Verlag, New York).
[22] Hernández-Lerma O, Lasserre JB (1999) Further Topics on Discrete-Time Markov Control Processes (Springer-Verlag, New York).
[23] Iyengar GN (2005) Robust dynamic programming. Math. Oper. Res. 30(2):257–280.
[24] Jaśkiewicz A, Nowak AS (2011) Stochastic games with unbounded payoffs: Applications to robust control in economics. Dynam. Games Appl. 1(2):253–279.
[25] Jaśkiewicz A, Nowak AS (2014) Robust Markov control processes. J. Math. Anal. Appl. 420(2):1337–1353.
[26] Maccheroni F, Marinacci M, Rustichini A (2006) Ambiguity aversion, robustness, and the variational representation of preferences. Econometrica 74(6):1447–1498.
[27] Morrison TJ (2001) Functional Analysis: An Introduction to Banach Space Theory (John Wiley & Sons, New York).
[28] Müller A (1997) How does the value function of a Markov decision process depend on the transition probabilities? Math. Oper. Res. 22(4):872–885.
[29] Nilim A, El Ghaoui L (2005) Robust control of Markov decision processes with uncertain transition matrices. Oper. Res. 53(5):780–798.
[30] Ó Searcóid M (2007) Metric Spaces (Springer-Verlag, London).
[31] Pichler A (2015) Premiums and reserves, adjusted by distortions. Scandinavian Actuarial J. 2015(4):332–351.
[32] Pichler A, Schlotter R (2020) Entropy based risk measures. Eur. J. Oper. Res. 285(1):223–236.
[33] Rieder U (1978) Measurable selection theorems for optimization problems. Manuscripta Math. 24(1):115–131.
[34] Rüschendorf L (2009) On the distributional transform, Sklar’s theorem, and the empirical copula process. J. Statist. Planning Inference 139(11):3921–3927.
[35] Rüschendorf L (2013) Mathematical Risk Analysis: Dependence, Risk Bounds, Optimal Allocations and Portfolios (Springer-Verlag, Berlin).
[36] Ruszczyński A (2010) Risk-averse dynamic programming for Markov decision processes. Math. Programming 125(2):235–261.
[37] Shen Y, Stannat W, Obermayer K (2013) Risk-sensitive Markov control processes. SIAM J. Control Optim. 51(5):3652–3672.
[38] Sion M (1958) On general minimax theorems. Pacific J. Math. 8(1):171–176.
[39] Wiesemann W, Kuhn D, Rustem B (2013) Robust Markov decision processes. Math. Oper. Res. 38(1):153–183.
[40] Xu H, Mannor S (2010) Distributionally robust Markov decision processes. Adv. Neural Inform. Processing Systems 23(2):2505–2513.
[41] Yang I (2017) A convex optimization approach to distributionally robust Markov decision processes with Wasserstein distance. IEEE Control Systems Lett. 1(1):164–169.

Mathematics of Operations Research

Volume 47, Issue 3

August 2022

Pages 1707-2545, C2

Article Information

Received: July 25, 2020
Accepted: June 11, 2021
Published Online: November 23, 2021

Cite as

Nicole Bäuerle, Alexander Glauner (2021) Distributionally Robust Markov Decision Processes and Their Connection to Risk Measures. Mathematics of Operations Research 47(3):1757-1780.

https://doi.org/10.1287/moor.2021.1187
