Stationary Policies in Dynamic Programming Models Under Compactness Assumptions

Published Online: https://doi.org/10.1287/moor.8.3.366

The present work deals with the usual stationary decision model of dynamic programming. The imposed convergence condition on the expected total rewards is so general that both the negative (unbounded) case and the positive (unbounded) case are included. However, the gambling model studied by Dubins and Savage is not covered by the present model.

In addition to the convergence condition, a continuity and compactness condition is imposed. The main result states that the supremum of the expected total rewards under all stationary policies is equal to the supremum under all (possibly randomized and non-Markovian) policies.
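In symbols, the main result can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $S$ denotes the state space, $\Pi$ the set of all (possibly randomized and non-Markovian) policies, $F$ the set of stationary policies, and $V_\pi(s)$ the expected total reward under policy $\pi$ from initial state $s$. Reading the equality pointwise in the initial state is likewise an assumption.

\[
  \sup_{f \in F} V_{f}(s) \;=\; \sup_{\pi \in \Pi} V_{\pi}(s) \qquad \text{for each } s \in S .
\]

Since $F \subseteq \Pi$, the inequality "$\le$" holds trivially; the substance of the result is the reverse inequality, which is where the convergence, continuity, and compactness conditions come in.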
