Discounted Markov Programming in a Periodic Process

Published Online: https://doi.org/10.1287/opre.13.6.920

This paper deals with a nonstationary discrete-time Markov process whose transition probabilities vary periodically in time. Each transition results in a reward that varies within the same cycle as the transition matrix. For infinite processes a policy-iteration algorithm is developed that effectively determines an optimal policy maximizing the total discounted reward. The paper represents an extension of R. A. Howard's policy-iteration technique for stationary Markov processes. A numerical example is given in which the developed iteration algorithm is demonstrated.
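
The following is a minimal sketch of how policy iteration can be organized for such a periodic process. It is not taken from the paper. It assumes the standard device of indexing the value function by the phase within the cycle, so that for a fixed policy v_t = r_t + beta * P_t v_{(t+1) mod T}. The function name `periodic_policy_iteration`, the array layout of `P` and `R`, the use of expected one-step rewards, and the tie-breaking tolerance are all illustrative assumptions.

```python
import numpy as np

def periodic_policy_iteration(P, R, beta, max_iter=1000):
    """Policy iteration for a discounted Markov process with period T.

    Assumed (hypothetical) layout:
      P[t, a] is the N x N transition matrix at phase t under action a,
      R[t, a] is the length-N vector of expected one-step rewards,
      beta is the discount factor with 0 <= beta < 1.
    Returns (policy, v), where policy[t, i] is the chosen action and
    v[t, i] the discounted value of state i at phase t.
    """
    T, A, N, _ = P.shape
    idx = np.arange(N)
    policy = np.zeros((T, N), dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve the T*N linear equations
        # v_t - beta * P_t(policy) v_{(t+1) mod T} = r_t(policy).
        M = np.eye(T * N)
        b = np.empty(T * N)
        for t in range(T):
            nxt = (t + 1) % T
            P_d = P[t][policy[t], idx]            # row i comes from action policy[t, i]
            M[t*N:(t+1)*N, nxt*N:(nxt+1)*N] -= beta * P_d
            b[t*N:(t+1)*N] = R[t][policy[t], idx]
        v = np.linalg.solve(M, b).reshape(T, N)

        # Policy improvement: at each phase, pick the action with the best
        # one-step lookahead; keep the current action unless strictly better.
        new_policy = policy.copy()
        for t in range(T):
            q = R[t] + beta * (P[t] @ v[(t + 1) % T])   # shape (A, N)
            q_current = q[policy[t], idx]
            better = q.max(axis=0) > q_current + 1e-12
            new_policy[t] = np.where(better, q.argmax(axis=0), policy[t])

        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy
    raise RuntimeError("policy iteration did not converge within max_iter")

# Illustrative data (not from the paper): one state, period 2, two actions.
P_example = np.ones((2, 2, 1, 1))
R_example = np.array([[[1.0], [0.0]],    # phase 0: action 0 pays 1, action 1 pays 0
                      [[0.0], [2.0]]])   # phase 1: action 0 pays 0, action 1 pays 2
example_policy, example_v = periodic_policy_iteration(P_example, R_example, beta=0.5)
```

In this toy instance the sketch selects action 0 at phase 0 and action 1 at phase 1, with values 8/3 and 10/3 respectively. Solving one linear system of size T*N per iteration is the simplest choice; the periodic block structure could be exploited to reduce this to a system of size N, which the sketch does not attempt.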
