The Parameter Iteration Method in Dynamic Programming

Published Online: https://doi.org/10.1287/mnsc.35.6.675

Many practical problems involve making optimal decisions for systems whose state is characterized by many components. Such problems lead to dynamic programs with a very large number of state variables, so deriving the optimal policy exactly is numerically infeasible because of the computer time and storage required.

This paper presents a practical method, called the Parameter Iteration Method, for obtaining an approximate solution to such problems. The computational difficulty caused by the high dimensionality of the state variable is overcome by an iterative method that combines simulation and recursive estimation to compute successive approximations of the value function.

The implementation of the Parameter Iteration Method is illustrated on the problem of finding an optimal replacement policy for a multi-item Markovian system.
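
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: simulate the system, and use recursive estimation to refit the parameters of an approximate value function at each iteration. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy multi-item wear model, the cost constants, the three quadratic features, the greedy one-step lookahead, the random restarts, and the use of recursive least squares as the estimator.

```python
import itertools
import numpy as np

# Hypothetical toy problem: each item has a wear level in 0..MAX_WEAR and
# wears by one level per period with probability P_WEAR.
N_ITEMS, MAX_WEAR = 3, 4
P_WEAR, GAMMA = 0.3, 0.9
C_OPERATE, C_REPLACE, C_SETUP = 1.0, 3.0, 2.0

# An action marks which items are replaced (1) this period.
ACTIONS = [np.array(a) for a in itertools.product([0, 1], repeat=N_ITEMS)]

def features(x):
    """Assumed parametric form: V(x) ~ theta . [1, sum(x), sum(x^2)]."""
    x = np.asarray(x, dtype=float)
    return np.array([1.0, x.sum(), (x ** 2).sum()])

def expected_next_features(x):
    """Exact E[features(next state)] when items wear independently."""
    x = np.asarray(x, dtype=float)
    worn = np.minimum(x + 1, MAX_WEAR)
    mean = (1 - P_WEAR) * x + P_WEAR * worn
    mean_sq = (1 - P_WEAR) * x ** 2 + P_WEAR * worn ** 2
    return np.array([1.0, mean.sum(), mean_sq.sum()])

def q_value(state, action, theta):
    """One-step cost plus discounted approximate cost-to-go."""
    post = np.where(action == 1, 0, state)  # replaced items reset to 0
    cost = (C_OPERATE * post.sum() + C_REPLACE * action.sum()
            + C_SETUP * action.any())
    return cost + GAMMA * expected_next_features(post) @ theta, post

def greedy(state, theta):
    """Minimize the one-step lookahead over all replacement subsets."""
    best = min(ACTIONS, key=lambda a: q_value(state, a, theta)[0])
    return q_value(state, best, theta)

def parameter_iteration(n_iters=30, n_steps=500, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(3)
    for _ in range(n_iters):
        new_theta = np.zeros(3)      # refit from scratch each iteration
        P = 1e3 * np.eye(3)          # recursive least squares covariance
        state = rng.integers(0, MAX_WEAR + 1, size=N_ITEMS)
        for _ in range(n_steps):
            # Bellman target computed with the previous iteration's theta.
            target, post = greedy(state, theta)
            phi = features(state)
            gain = P @ phi / (1.0 + phi @ P @ phi)
            new_theta = new_theta + gain * (target - phi @ new_theta)
            P = P - np.outer(gain, phi @ P)
            # Simulate one period; restart occasionally to cover more states.
            wear = rng.random(N_ITEMS) < P_WEAR
            state = np.minimum(post + wear, MAX_WEAR)
            if rng.random() < 0.1:
                state = rng.integers(0, MAX_WEAR + 1, size=N_ITEMS)
        theta = new_theta
    return theta

if __name__ == "__main__":
    print(parameter_iteration())
```

The point of the sketch is that only the three parameters in `theta` are stored and updated, rather than a table over all joint wear states. The toy enumerates every replacement subset in `greedy`, which is only workable for a handful of items, and this kind of fitted iteration is not guaranteed to converge in general.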
