Ordinal Dynamic Programming

Matthew J. Sobel
Yale University

Published Online: 1 May 1975

https://doi.org/10.1287/mnsc.21.9.967

Abstract

Numerically valued reward processes are found in most dynamic programming models. Mitten, however, recently formulated finite horizon sequential decision processes in which a real-valued reward need not be earned at each stage. Instead of the cardinality assumption implicit in past models, Mitten assumes that a decision maker has a preference order over a general collection of outcomes (which need not be numerically valued). This paper investigates infinite horizon ordinal dynamic programming models. Both deterministic and stochastic models are considered. It is shown that an optimal policy exists if and only if some stationary policy is optimal. Moreover, “policy improvement” leads to better policies using either Howard-Blackwell or Eaton-Zadeh procedures. The results illuminate the roles played by various sets of assumptions in the literature on Markovian decision processes.

Volume 21, Issue 9

May 1975

Pages 967-1086

Cite as

Matthew J. Sobel (1975) Ordinal Dynamic Programming. Management Science 21(9):967-975.

https://doi.org/10.1287/mnsc.21.9.967
