Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem

Eric V. Denardo
Yale University, New Haven, Connecticut

Published Online: 1 Apr 1970

https://doi.org/10.1287/opre.18.2.279

Abstract

This paper treats a discrete-time Markov decision model with an infinite planning horizon and no discounting. A “bias-optimal” policy for this decision problem satisfies a criterion that is more selective than maximizing the gain rate. The problem of computing a bias-optimal policy, also treated by Veinott in 1966, is here parsed into a sequence of three simple Markov decision problems, each of which can be solved by linear programming or policy iteration.

Volume 18, Issue 2

March-April 1970

Pages 193-373

Cite as

Eric V. Denardo, (1970) Computing a Bias-Optimal Policy in a Discrete-Time Markov Decision Problem. Operations Research 18(2):279-289.

https://doi.org/10.1287/opre.18.2.279
