Uniform Turnpike Theorems for Finite Markov Decision Processes

Mark E. Lewis (Corresponding Author)
http://orcid.org/0000-0002-8428-9547
School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853

Anand Paul
Warrington College of Business, University of Florida, Gainesville, Florida 32611

Published Online: 14 Sep 2018
https://doi.org/10.1287/moor.2017.0912


Mathematics of Operations Research

Volume 44, Issue 4

November 2019

Pages 1145-1509, C2

Article Information

Received: May 23, 2016
Accepted: August 29, 2017
Published Online: September 14, 2018

Cite as

Mark E. Lewis, Anand Paul (2018) Uniform Turnpike Theorems for Finite Markov Decision Processes. Mathematics of Operations Research 44(4):1145-1160.

https://doi.org/10.1287/moor.2017.0912
