A Note on Positive Dynamic Programming

Published Online: https://doi.org/10.1287/moor.11.2.383

This note considers total-reward Markov decision processes with a countable state space. For these models it is well known that in the positive case, i.e., when the immediate reward function is nonnegative, the following hold without further conditions: (1) value iteration converges and (2) there exist pointwise good stationary strategies. Here we show that (1) remains true if the nonnegativity of the immediate reward function is replaced by the nonnegativity of the value function, and that (2) remains true if there exists a strategy with nonnegative total rewards.
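As an illustration of statement (1) in the classical positive case, the following minimal sketch runs undiscounted value iteration from v_0 = 0 on a toy model. The two-state model, the function name `value_iteration`, and the tables `REWARDS` and `TRANSITIONS` are hypothetical and are not taken from the note; the note itself treats countable state spaces, whereas this example is finite.

```python
def value_iteration(rewards, transitions, n_iter=200):
    """Undiscounted total-reward value iteration starting from v_0 = 0.

    rewards[s][a]     -> immediate reward for action a in state s
    transitions[s][a] -> dict mapping next state to its probability
    """
    states = list(rewards)
    v = {s: 0.0 for s in states}
    history = [dict(v)]  # keep every iterate to inspect monotonicity
    for _ in range(n_iter):
        # v_{n+1}(s) = max_a [ r(s, a) + sum_t p(t | s, a) * v_n(t) ]
        v = {
            s: max(
                rewards[s][a]
                + sum(p * v[t] for t, p in transitions[s][a].items())
                for a in rewards[s]
            )
            for s in states
        }
        history.append(dict(v))
    return v, history


# Hypothetical toy model with nonnegative rewards (assumption, not from the note).
# State 0: "wait" earns 0 and stays put; "try" earns 1 and, with probability 0.5,
# moves to the absorbing, reward-free state 1.
REWARDS = {0: {"wait": 0.0, "try": 1.0}, 1: {"stay": 0.0}}
TRANSITIONS = {
    0: {"wait": {0: 1.0}, "try": {0: 0.5, 1: 0.5}},
    1: {"stay": {1: 1.0}},
}

v, history = value_iteration(REWARDS, TRANSITIONS)
```

In this toy model the value of state 0 solves v = 1 + 0.5 v, so v = 2, and the iterates 0, 1, 1.5, 1.75, ... increase monotonically toward it, which is the behavior nonnegative rewards guarantee when starting from zero.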
