Optimum Policy Regions for Markov Processes with Discounting
Abstract
In many practical situations, the discount factor applied to future rewards and costs is not known precisely. This uncertainty matters because, in models of such situations, the optimum policy often depends on the discount factor. We examine this dependence for the class of finite-state, time-invariant Markov models. We develop a procedure for finding the value of the discount factor at which we are indifferent between two policies, and then extend it to obtain a complete description of the optimum policy regions over any range of the discount factor. Two examples are presented.
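To make the notion of an indifference point concrete, the following is a minimal sketch and not the procedure developed in the paper. It assumes each fixed policy is described by a transition matrix P and a one-step reward vector r, so that its discounted value vector v solves (I - beta P) v = r, and it assumes the two policies are compared at a single chosen starting state. The function names, the toy two-state data, and the use of plain bisection are all illustrative assumptions.

```python
def policy_value(P, r, beta):
    """Discounted value of a fixed policy: solve (I - beta*P) v = r.

    Uses Gaussian elimination with partial pivoting; requires 0 <= beta < 1.
    """
    n = len(r)
    # Augmented matrix [I - beta*P | r]
    A = [[(1.0 if i == j else 0.0) - beta * P[i][j] for j in range(n)] + [r[i]]
         for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda k: abs(A[k][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for k in range(c + 1, n):
            f = A[k][c] / A[c][c]
            for j in range(c, n + 1):
                A[k][j] -= f * A[c][j]
    # Back substitution
    v = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (A[i][n] - tail) / A[i][i]
    return v


def indifference_beta(PA, rA, PB, rB, state, lo=0.0, hi=0.99, tol=1e-10):
    """Bisection for a beta in [lo, hi] where both policies have equal value
    at `state`. Returns None if the value gap has the same sign at both ends.
    (Illustrative only; finds one crossing, not necessarily all of them.)
    """
    def gap(beta):
        return policy_value(PA, rA, beta)[state] - policy_value(PB, rB, beta)[state]

    g_lo, g_hi = gap(lo), gap(hi)
    if g_lo * g_hi > 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = gap(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Hypothetical toy data: in state 0, policy A stays put and earns 1 per step;
# policy B earns 0 now and moves to absorbing state 1, which earns 2 per step.
PA, rA = [[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0]
PB, rB = [[0.0, 1.0], [0.0, 1.0]], [0.0, 2.0]

beta_star = indifference_beta(PA, rA, PB, rB, state=0)
print(round(beta_star, 6))
```

In this toy case the values from state 0 are 1/(1 - beta) for policy A and 2 beta/(1 - beta) for policy B, so the sketch should locate the crossing at beta = 0.5: policy A is preferred below that value and policy B above it.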

