Markov Strategies in Dynamic Programming

Published Online: https://doi.org/10.1287/moor.3.1.37

It is shown that the supremum of the expected total return over the Markov strategies equals the supremum over all strategies. The model assumes that the state space is countable, the action space is measurable, and the supremum of the expected total of the positive rewards over the Markov strategies is finite.
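
The statement above can be written symbolically as follows. This is only a sketch: the notation (S, \Pi, M, r_n, v, the initial state i, and indexing stages from 0) is assumed here for illustration and may differ from the paper's own.

```latex
% Assumed notation, not taken from the paper:
%   S    countable state space
%   \Pi  set of all strategies
%   M    set of Markov strategies, M \subseteq \Pi
%   r_n  reward received at stage n, with r_n^+ = \max(r_n, 0)
\[
  v(i,\pi) = \mathbb{E}_{i,\pi}\Bigl[\sum_{n=0}^{\infty} r_n\Bigr],
  \qquad i \in S,\ \pi \in \Pi .
\]
% Finiteness assumption on the positive rewards:
\[
  \sup_{\pi \in M} \mathbb{E}_{i,\pi}\Bigl[\sum_{n=0}^{\infty} r_n^{+}\Bigr] < \infty .
\]
% Result stated in the abstract:
\[
  \sup_{\pi \in M} v(i,\pi) = \sup_{\pi \in \Pi} v(i,\pi) .
\]
```

Read this way, the result says that restricting attention to strategies that depend only on the current state and stage, rather than on the full history, does not lower the best attainable expected total return.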
