Policy Bounds for Markov Decision Processes

William S. Lovejoy
Georgia Institute of Technology, Atlanta, Georgia

Published Online: 1 Aug 1986
https://doi.org/10.1287/opre.34.4.630

Abstract

This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above for all states. We present sufficient conditions for several computationally attractive approximations to generate rigorous policy bounds. These approximations include approximating the optimal value function, replacing the original MDP with a separable approximate MDP, and approximating a stochastic MDP with its deterministic counterpart. An example from the field of fisheries management demonstrates the practical applicability of the results.
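To make the idea of a policy bound concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy harvesting model, the growth rule, the reward, and every name and constant (`N`, `GAMMA`, `SHOCKS`, `next_stock`, `solve`, `bounds_from_below`) are assumptions chosen only for illustration. It solves a small stochastic MDP and its deterministic counterpart (each random shock replaced by its mean) by value iteration, then checks whether one policy lies at or below the other in every state. Whether that ordering actually holds depends on the sufficient conditions developed in the paper, which are not reproduced here, so the script reports the comparison rather than assuming its outcome.

```python
import math

# All constants below are hypothetical, chosen only for illustration.
N = 10                               # largest stock level; states are 0..N
GAMMA = 0.9                          # discount factor
SHOCKS = [(0.5, 0.5), (1.5, 0.5)]    # (growth multiplier, probability), mean 1.0
MEAN_SHOCK = [(1.0, 1.0)]            # deterministic counterpart: shock at its mean


def next_stock(escapement, shock):
    """Assumed growth rule: escapement doubles, is scaled by the shock, capped at N."""
    return min(N, int(round(2 * escapement * shock)))


def solve(shocks, sweeps=200):
    """Value iteration; returns the greedy harvest for each stock level."""
    v = [0.0] * (N + 1)
    policy = [0] * (N + 1)
    for _ in range(sweeps):
        new_v = []
        for s in range(N + 1):
            best_q, best_a = -math.inf, 0
            for a in range(s + 1):  # harvest a units, leaving s - a to grow
                future = sum(p * v[next_stock(s - a, z)] for z, p in shocks)
                q = math.sqrt(a) + GAMMA * future  # assumed concave reward
                if q > best_q + 1e-12:  # ties go to the smaller harvest
                    best_q, best_a = q, a
            new_v.append(best_q)
            policy[s] = best_a
        v = new_v
    return policy


def bounds_from_below(candidate, policy):
    """True if candidate[s] <= policy[s] in every state s."""
    return all(c <= p for c, p in zip(candidate, policy))


stochastic = solve(SHOCKS)
deterministic = solve(MEAN_SHOCK)

print("stochastic policy:   ", stochastic)
print("deterministic policy:", deterministic)
print("deterministic bounds stochastic from below:",
      bounds_from_below(deterministic, stochastic))
print("deterministic bounds stochastic from above:",
      bounds_from_below(stochastic, deterministic))
```

The appeal of such a comparison is that the deterministic model is cheaper to solve; when a bound of this kind is known to hold, it narrows the set of actions that must be searched in the original problem.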

Volume 34, Issue 4

July-August 1986

Pages 501-653

Cite as

William S. Lovejoy (1986) Policy Bounds for Markov Decision Processes. Operations Research 34(4):630-637.

https://doi.org/10.1287/opre.34.4.630
