On the Complexity of the Policy Improvement Algorithm for Markov Decision Processes
Abstract
We consider the complexity of the policy improvement algorithm for Markov decision processes. We show that four variants of the algorithm require exponential time in the worst case.
INFORMS Journal on Computing, ISSN 1091-9856, was published as ORSA Journal on Computing from 1989 to 1995 under ISSN 0899-1499.