Markov Chain Design Problems

Published Online:https://doi.org/10.1287/opre.29.5.959

System design problems using Markov chains and denoted Markov chain design problems are introduced. It is assumed that the chains of the problem have transition matrices and one period conditional expected rewards which are differentiable functions of a vector parameter. A gradient algorithm for maximizing the sum of the discounted expected reward or the limit of the one period expected reward is presented. The algorithm uses approximate objective function values and approximate gradients at each stage. Numerical work on a simple one dimensional queueing model solved the design problem correctly and quickly even though minimal approximation accuracy was used. The convergence of the approximation is proven for an appropriately converging sequence of design parameter values.

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.