Markov Strategies in Dynamic Programming
Abstract
It is shown that the supremum of the expected total return over the Markov strategies equals the supremum over all strategies. The model assumptions are: the state space is countable, the action space is measurable, and the supremum over the Markov strategies of the expected total of the positive rewards is finite.
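
In symbols, the result can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the source: $i$ is a fixed initial state, $\Pi$ is the set of all strategies, $\Pi_M \subseteq \Pi$ is the set of Markov strategies, $r_n$ is the reward received at stage $n$, $r_n^+ = \max(r_n, 0)$ is its positive part, and $\mathbb{E}_{i,\pi}$ is expectation under strategy $\pi$ starting from $i$.

\[
\text{If}\quad \sup_{\pi \in \Pi_M} \mathbb{E}_{i,\pi}\Bigl[\sum_{n=0}^{\infty} r_n^{+}\Bigr] < \infty,
\quad\text{then}\quad
\sup_{\pi \in \Pi_M} \mathbb{E}_{i,\pi}\Bigl[\sum_{n=0}^{\infty} r_n\Bigr]
= \sup_{\pi \in \Pi} \mathbb{E}_{i,\pi}\Bigl[\sum_{n=0}^{\infty} r_n\Bigr].
\]

The inequality "$\le$" is immediate because $\Pi_M \subseteq \Pi$; the substance of the claim is the reverse inequality.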

