An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes

Jiaqiao Hu
Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, New York 11794, USA

Michael C. Fu
Robert H. Smith School of Business and Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA

Vahid R. Ramezani
Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA

Steven I. Marcus
Department of Electrical and Computer Engineering, and Institute for Systems Research, University of Maryland, College Park, Maryland 20742, USA

Published Online: 1 May 2007
https://doi.org/10.1287/ijoc.1050.0155

INFORMS Journal on Computing

Volume 19, Issue 2

Spring 2007

Pages 149-312

Article Information

Received: April 01, 2004
Accepted: June 01, 2005
Published Online: May 01, 2007

Cite as

Jiaqiao Hu, Michael C. Fu, Vahid R. Ramezani, Steven I. Marcus (2007) An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes. INFORMS Journal on Computing 19(2):161-174.

https://doi.org/10.1287/ijoc.1050.0155
