How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities?

Alfred Müller
Institut für Wirtschaftstheorie und Operations Research, Universität Karlsruhe, Kaiserstr. 12, D-76128 Karlsruhe, Germany

Published Online: 1 Nov 1997
https://doi.org/10.1287/moor.22.4.872

Abstract

The present work deals with the comparison of (discrete time) Markov decision processes (MDPs), which differ only in their transition probabilities. We show that the optimal value function of an MDP is monotone with respect to appropriately defined stochastic order relations. We also find conditions for continuity with respect to suitable probability metrics. The results are applied to some well-known examples, including inventory control and optimal stopping.
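The monotonicity claim can be illustrated numerically on a toy problem. The sketch below is not taken from the paper: the clipped random walk, the reward `state - cost`, the discount factor, and all function names are assumptions chosen for illustration. It uses one standard setting in which such a result is known to hold (rewards increasing in the state, stochastically monotone kernels, and a second kernel that dominates the first in the usual stochastic order), which need not match the paper's conditions.

```python
# Toy comparison of two MDPs that differ only in their transition probabilities.
# Everything here (model, numbers, names) is a hypothetical illustration.

def clipped_walk_kernel(step_probs, n_states):
    """Row s is the law of min(max(s + d, 0), n_states - 1) for d in {-1, 0, +1}."""
    kernel = [[0.0] * n_states for _ in range(n_states)]
    for s in range(n_states):
        for d, p in zip((-1, 0, 1), step_probs):
            t = min(max(s + d, 0), n_states - 1)
            kernel[s][t] += p
    return kernel

def value_iteration(kernels, rewards, beta=0.9, sweeps=200):
    """Apply the Bellman optimality operator `sweeps` times, starting from zero."""
    n_states = len(rewards)
    n_actions = len(kernels)
    v = [0.0] * n_states
    for _ in range(sweeps):
        v = [
            max(
                rewards[s][a]
                + beta * sum(p * w for p, w in zip(kernels[a][s], v))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
    return v

N_STATES = 5
COSTS = (0.0, 0.5)  # action 1 costs more but drifts upward more strongly
rewards = [[s - c for c in COSTS] for s in range(N_STATES)]  # increasing in s

# Step laws over (-1, 0, +1); each "high" law dominates the matching "low" law
# in the usual stochastic order (its upper tail probabilities are larger).
low_steps = [(0.5, 0.3, 0.2), (0.3, 0.3, 0.4)]
high_steps = [(0.3, 0.3, 0.4), (0.1, 0.3, 0.6)]

v_low = value_iteration([clipped_walk_kernel(p, N_STATES) for p in low_steps], rewards)
v_high = value_iteration([clipped_walk_kernel(p, N_STATES) for p in high_steps], rewards)

# Under these assumptions the dominated model should never have the larger value.
ordered = all(h >= l - 1e-9 for l, h in zip(v_low, v_high))
```

In this construction the ordering holds because the clipped walk moves to a higher state whenever it starts from a higher state, so the iterates stay increasing in the state, and shifting probability mass upward can then only increase the expected continuation value.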

Mathematics of Operations Research

Volume 22, Issue 4

November 1997

Pages 769-1022

Cite as

Alfred Müller (1997) How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities? Mathematics of Operations Research 22(4):872-885.

https://doi.org/10.1287/moor.22.4.872
