Markov Chain Design Problems
Abstract
System design problems using Markov chains and denoted Markov chain design problems are introduced. It is assumed that the chains of the problem have transition matrices and one period conditional expected rewards which are differentiable functions of a vector parameter. A gradient algorithm for maximizing the sum of the discounted expected reward or the limit of the one period expected reward is presented. The algorithm uses approximate objective function values and approximate gradients at each stage. Numerical work on a simple one dimensional queueing model solved the design problem correctly and quickly even though minimal approximation accuracy was used. The convergence of the approximation is proven for an appropriately converging sequence of design parameter values.

