On the Convergence of Policy Iteration in Stationary Dynamic Programming

Published Online: https://doi.org/10.1287/moor.4.1.60

The policy iteration method of dynamic programming is studied in an abstract setting. It is shown to be equivalent to the Newton-Kantorovich iteration procedure applied to the functional equation of dynamic programming. This equivalence is used to obtain the rate of convergence and error bounds for the sequence of values generated by policy iteration. These results are discussed in the context of the finite state Markovian decision problem with compact action space. An example is analyzed in detail.
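The equivalence stated in the abstract can be illustrated numerically in the simplest discounted case. The sketch below is not taken from the paper: it assumes finite action sets (the paper allows a compact action space), a discount factor `beta` below 1, and a small made-up two-state problem. All names (`solve`, `greedy`, `evaluate`, `newton_step`, `policy_iteration`, `P`, `r`) are hypothetical. Writing the functional equation as B(v) = Tv - v = 0, one Newton step from v uses the greedy policy d at v and gives v + (I - beta P_d)^{-1} (Tv - v), which works out to the value of policy d, that is, one step of policy iteration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            f = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def q_value(s, a, v, P, r, beta):
    """One-step return of action a in state s, continuing with values v."""
    return r[s][a] + beta * sum(P[s][a][t] * v[t] for t in range(len(v)))


def greedy(v, P, r, beta):
    """Policy attaining the maximum in Tv (first maximizer on ties)."""
    return [max(range(len(r[s])), key=lambda a: q_value(s, a, v, P, r, beta))
            for s in range(len(v))]


def system_matrix(policy, P, beta):
    """The matrix I - beta * P_d for a fixed policy d."""
    n = len(policy)
    return [[(1.0 if s == t else 0.0) - beta * P[s][policy[s]][t]
             for t in range(n)] for s in range(n)]


def evaluate(policy, P, r, beta):
    """Policy evaluation: solve (I - beta * P_d) v = r_d."""
    rd = [r[s][policy[s]] for s in range(len(policy))]
    return solve(system_matrix(policy, P, beta), rd)


def newton_step(v, P, r, beta):
    """One Newton step on B(v) = Tv - v, using -B'(v) = I - beta * P_d."""
    d = greedy(v, P, r, beta)
    residual = [q_value(s, d[s], v, P, r, beta) - v[s] for s in range(len(v))]
    delta = solve(system_matrix(d, P, beta), residual)
    return [v[s] + delta[s] for s in range(len(v))]


def policy_iteration(P, r, beta, max_iter=100):
    """Alternate evaluation and greedy improvement until the policy repeats."""
    policy = [0] * len(r)
    for _ in range(max_iter):
        v = evaluate(policy, P, r, beta)
        improved = greedy(v, P, r, beta)
        if improved == policy:
            break
        policy = improved
    return v, policy


# Made-up illustrative problem: P[s][a][t] is the transition probability
# from s to t under action a, and r[s][a] is the one-step reward.
P = [[[0.5, 0.5], [0.0, 1.0]],   # state 0 has two actions
     [[0.0, 1.0]]]               # state 1 has one action
r = [[5.0, 10.0], [-1.0]]
beta = 0.95

v0 = [0.0, 0.0]
v_newton = newton_step(v0, P, r, beta)
v_policy = evaluate(greedy(v0, P, r, beta), P, r, beta)
v_star, d_star = policy_iteration(P, r, beta)
```

Here `v_newton` and `v_policy` agree up to rounding, which is the equivalence in miniature. The linear solver is written out by hand only to keep the sketch dependency-free; any linear algebra routine would serve.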
