Robust Modified Policy Iteration

David L. Kaufman
Department of Industrial and Operations Engineering, University of Michigan, Ann Arbor, Michigan 48109

Andrew J. Schaefer
Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261

Published Online: 6 Jun 2012, https://doi.org/10.1287/ijoc.1120.0509

INFORMS Journal on Computing

Volume 25, Issue 3

Summer 2013

Pages 395-598

Received: February 01, 2010
Accepted: January 01, 2012
Published Online: June 06, 2012

Cite as

David L. Kaufman, Andrew J. Schaefer (2012) Robust Modified Policy Iteration. INFORMS Journal on Computing 25(3):396-410.

https://doi.org/10.1287/ijoc.1120.0509
