Variability Sensitive Markov Decision Processes

Melike Baykal-Gürsoy
Industrial Engineering Department, Rutgers University, Piscataway, New Jersey 08854
Keith W. Ross
Department of Systems, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Published Online: 1 Aug 1992

https://doi.org/10.1287/moor.17.3.558

Abstract

Considered are time-average Markov Decision Processes (MDPs) with finite state and action spaces. Two definitions of variability are introduced, namely, the expected time-average variability and time-average expected variability. The two criteria are in general different, although they can both be employed to penalize for variance in the stream of rewards. For communicating MDPs, we construct a (randomized) stationary policy that is ε-optimal for both criteria; the policy is optimal and pure for a specific variability function. For general multichain MDPs, a state space decomposition leads to a similar result for the expected time-average variability. We also consider the problem of the decision maker choosing the initial state along with the policy.
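
The abstract does not state the variability functions themselves, so the following is only a minimal sketch under stated assumptions. It fixes a stationary policy on a hypothetical two-state MDP (so the process reduces to an ergodic Markov chain), and assumes a variance-style penalty of the form mean reward minus `lam` times the long-run squared deviation of the per-step reward from that mean. The transition matrix `P`, the rewards `r`, and the weight `lam` are illustrative values, not taken from the paper.

```python
# Hypothetical example: long-run mean reward and a variance-style penalty
# for the Markov chain induced by one fixed stationary policy.

def stationary_distribution(P, iters=10_000):
    """Approximate the stationary distribution by power iteration.

    Assumes the chain is irreducible and aperiodic, so the iteration converges.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def mean_and_variability(P, r):
    """Return (long-run mean reward, long-run mean squared deviation)."""
    pi = stationary_distribution(P)
    mean = sum(p * x for p, x in zip(pi, r))
    var = sum(p * (x - mean) ** 2 for p, x in zip(pi, r))
    return mean, var

# Assumed data: transition matrix under the fixed policy, reward per state.
P = [[0.9, 0.1],
     [0.5, 0.5]]
r = [1.0, 4.0]
lam = 0.5  # assumed penalty weight on variability

mean, var = mean_and_variability(P, r)
penalized = mean - lam * var
print(mean, var, penalized)
```

For an ergodic chain, the sample-path time averages of the reward and of its squared deviation converge to these stationary quantities, which is why the sketch works with the stationary distribution rather than simulating a trajectory. Comparing `penalized` across candidate policies would be one way to see the trade-off between mean reward and variability that the abstract describes.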

Mathematics of Operations Research

Volume 17, Issue 3

August 1992

Pages 509-764

Published Online: August 01, 1992

Cite as

Melike Baykal-Gürsoy, Keith W. Ross, (1992) Variability Sensitive Markov Decision Processes. Mathematics of Operations Research 17(3):558-571.

https://doi.org/10.1287/moor.17.3.558
