Robust Modified Policy Iteration
Published Online: 6 Jun 2012 https://doi.org/10.1287/ijoc.1120.0509

