Technical Note—The Method of Successive Approximations and Markovian Decision Problems

Published Online: https://doi.org/10.1287/opre.22.3.519

This note considers Howard's discrete-time Markovian decision model with the average return as the criterion. Using results of Blackwell and MacQueen for the discounted return model, it is shown in full generality that the Odoni bounds contain both the maximal average return and the average return of the current policy.
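As a rough illustration of the bracketing property described above, the following Python sketch runs successive approximations on a small two-state, two-action model and checks that the Odoni bounds, taken here as the minimum and maximum over states of the difference between consecutive value iterates, contain the average return of the current policy. The function names `odoni_bounds` and `policy_gain`, the array layout, and the numerical data are assumptions made for this example and do not come from the note. The gain computation additionally assumes that the policy induces a single recurrent class, which is narrower than the general setting the note covers.

```python
import numpy as np


def odoni_bounds(P, r, n_iter=50):
    """Successive approximations for the average-return criterion.

    P has shape (A, S, S): P[a, s, t] is the probability of moving s -> t under action a.
    r has shape (A, S): r[a, s] is the expected one-step return in state s under action a.
    Returns the bounds from the last iteration and the policy that attained the maximum.
    """
    v = np.zeros(P.shape[1])
    for _ in range(n_iter):
        q = r + P @ v                # shape (A, S): one-step lookahead values
        v_next = q.max(axis=0)       # maximize over actions
        policy = q.argmax(axis=0)    # current policy: maximizing action per state
        d = v_next - v               # difference of consecutive iterates
        lower, upper = d.min(), d.max()
        v = v_next
    return lower, upper, policy


def policy_gain(P, r, policy):
    """Average return of a stationary policy (assumes a single recurrent class)."""
    S = P.shape[1]
    states = np.arange(S)
    P_f = P[policy, states]          # (S, S) transition matrix under the policy
    r_f = r[policy, states]          # (S,) one-step returns under the policy
    # Solve pi P_f = pi with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([P_f.T - np.eye(S), np.ones(S)])
    b = np.append(np.zeros(S), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(pi @ r_f)


# Illustrative data only (hypothetical, not taken from the note).
P = np.array([[[0.5, 0.5], [0.4, 0.6]],
              [[0.8, 0.2], [0.7, 0.3]]])
r = np.array([[6.0, -3.0],
              [4.0, -5.0]])

lower, upper, policy = odoni_bounds(P, r)
gain = policy_gain(P, r, policy)
print(f"bounds: [{lower:.6f}, {upper:.6f}], gain of current policy: {gain:.6f}")
```

In a sketch like this, the gap between the two bounds can serve as a stopping test: once it falls below a chosen tolerance, the current policy's average return is within that tolerance of the maximal one.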
