Dynamic Programming Deconstructed: Transformations of the Bellman Equation and Computational Efficiency

Qingyin Ma
[email protected]
https://orcid.org/0000-0001-8862-4210
International School of Economics and Management, Capital University of Economics and Business, Beijing 100070, China

John Stachurski (Corresponding Author)
[email protected]
https://orcid.org/0000-0001-6716-0111
Research School of Economics, Australian National University, Acton, ACT 2601, Australia

Published Online: 8 Feb 2021
https://doi.org/10.1287/opre.2020.2006

Volume 69, Issue 5

September-October 2021

Pages ii-iv, 1349-1650, C2

Received: January 24, 2019
Accepted: January 3, 2020
Published Online: February 8, 2021

Cite as

Qingyin Ma, John Stachurski (2021) Dynamic Programming Deconstructed: Transformations of the Bellman Equation and Computational Efficiency. Operations Research 69(5):1591–1607.

https://doi.org/10.1287/opre.2020.2006
