An Algorithm to Identify and Compute Average Optimal Policies in Multichain Markov Decision Processes

Arie Leizarowitz
Department of Mathematics, Technion, Technion City, Haifa 32000, Israel

Published Online: 1 Aug 2003
https://doi.org/10.1287/moor.28.3.553.16388

Abstract

This paper concerns discrete-time, finite-state multichain MDPs with compact action sets. The optimality criterion is long-run average cost. Simple examples illustrate that optimal stationary Markov policies do not always exist. We establish the existence of ε-optimal policies that are stationary Markovian, and develop an algorithm that computes these approximately optimal policies. We establish a necessary and sufficient condition for the existence of an optimal policy that is stationary Markovian, and when such an optimal policy exists the algorithm computes it.
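
To make the long-run average cost criterion concrete, the following is a minimal sketch, not the algorithm developed in the paper. It assumes a single fixed stationary policy, so the MDP reduces to a Markov chain with transition matrix P and per-step cost vector c. The function name `average_cost`, the three-state chain, and all numbers are hypothetical and chosen only for illustration. The chain has two absorbing states, hence two recurrent classes (the multichain case), and the average cost then depends on the starting state.

```python
def average_cost(P, c, horizon=100_000):
    """Approximate the long-run average cost from each starting state.

    Computes the finite-horizon average (1/T) * sum_{t=0}^{T-1} (P^t c)(s)
    by repeatedly applying v <- P v, starting from v = c.
    """
    n = len(c)
    v = list(c)              # v holds P^t c, starting at t = 0
    total = [0.0] * n
    for _ in range(horizon):
        for s in range(n):
            total[s] += v[s]
        v = [sum(P[s][j] * v[j] for j in range(n)) for s in range(n)]
    return [x / horizon for x in total]


# Hypothetical chain: state 0 is transient, states 1 and 2 are absorbing.
P = [
    [0.0, 0.5, 0.5],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
c = [10.0, 1.0, 3.0]         # hypothetical per-step costs

gains = average_cost(P, c)
```

Starting in state 1 or 2, the average cost equals that state's own cost. Starting in state 0, the one-time cost of 10 washes out in the limit and the average approaches the mixture 0.5 · 1 + 0.5 · 3 = 2, which illustrates why the average cost is a vector indexed by the initial state in the multichain setting.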

Mathematics of Operations Research

Volume 28, Issue 3

August 2003

Pages 395-608

Article Information

Received: October 06, 1998
Published Online: August 01, 2003

Cite as

Arie Leizarowitz (2003) An Algorithm to Identify and Compute Average Optimal Policies in Multichain Markov Decision Processes. Mathematics of Operations Research 28(3):553-586.

https://doi.org/10.1287/moor.28.3.553.16388
