An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes

Published Online: https://doi.org/10.1287/ijoc.1050.0155
