Optimum Policy Regions for Markov Processes with Discounting

Published Online: https://doi.org/10.1287/opre.14.4.658

In many practical situations the discount factor for future rewards and costs is not known precisely. In models of such situations, this uncertainty matters because the optimum policy often depends on the discount factor. We discuss this dependence of the optimum policy on the discount factor for the class of finite-state, time-invariant Markov models. A procedure is developed for finding the value of the discount factor at which we are indifferent between two policies. This is then extended to a discussion of how to find a complete description of the optimum policy regions over any range of the discount factor. Two examples are presented.
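To make the idea of an indifference point concrete, the following is a minimal sketch, not the procedure developed in the paper. It assumes each fixed policy is described by a transition matrix P and a reward vector r, evaluates the policy by solving (I - beta P) v = r, and uses plain bisection to find a discount factor beta at which two policies have equal value from a chosen start state. All names (`solve`, `policy_value`, `indifference_beta`) and the two-state data are hypothetical, and bisection is a swapped-in generic root finder that assumes a single sign change inside the bracket.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def policy_value(P, r, beta):
    """Discounted value of a fixed policy: v = (I - beta P)^(-1) r, for 0 <= beta < 1."""
    n = len(r)
    a = [[(1.0 if i == j else 0.0) - beta * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(a, r)


def indifference_beta(PA, rA, PB, rB, state, lo, hi, tol=1e-10):
    """Bisect on beta for equal value of policies A and B from `state`.

    Assumption: the value gap changes sign exactly once on [lo, hi].
    """
    def gap(beta):
        return policy_value(PA, rA, beta)[state] - policy_value(PB, rB, beta)[state]

    g_lo = gap(lo)
    if g_lo * gap(hi) > 0:
        raise ValueError("value gap does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = gap(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Hypothetical two-state example (illustrative data, not from the paper).
# State 1 is absorbing with reward 2 under both policies.
# Policy A in state 0: collect reward 1 and stay in state 0.
P_A, r_A = [[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0]
# Policy B in state 0: collect reward 0 and move to state 1.
P_B, r_B = [[0.0, 1.0], [0.0, 1.0]], [0.0, 2.0]

beta_star = indifference_beta(P_A, r_A, P_B, r_B, state=0, lo=0.1, hi=0.9)
```

In this toy example the values from state 0 are 1/(1 - beta) for policy A and 2 beta/(1 - beta) for policy B, so the computed `beta_star` should land near 0.5, with A preferred below that point and B preferred above it. In general the value gap between two policies can change sign more than once and may differ by start state, so a single bracket may not capture every boundary of the optimum policy regions.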
