A Dynamic Programming Algorithm for Embedded Markov Chains when the Planning Horizon is at Infinity

Published Online:https://doi.org/10.1287/mnsc.10.4.716

This paper presents an algorithm for the solution of dynamic programming problems requiring the determination of optimal policies for the control of a special class of stochastic processes when the time horizon of the planning period is at infinity. These processes can be mathematically described as discrete time parameter Markov chains with a finite number of states which have been “embedded” in continuous time in the sense that the time between transitions is a random variable whose probability distribution depends only on the states between which the transition takes place. Such processes are called Markov-renewal processes. The Markov processes considered by R. A. Howard in [1] are really two special cases of this somewhat wider class of stochastic processes. In these two special cases, the algorithm of this paper is identical with Howard's. In fact, with only slight modification, Howard's algorithm can be extended to this wider class of stochastic processes.

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.