Forward Recursion for Markov Decision Processes with Skip-Free-to-the-Right Transitions, Part I: Theory and Algorithm

Jacob Wijngaard
Department of Industrial Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands

Shaler Stidham, Jr.
Department of Industrial Engineering, North Carolina State University, Box 7906, Raleigh, North Carolina 27695-7906

Published Online: 1 May 1986
https://doi.org/10.1287/moor.11.2.295

Abstract

We consider a Markovian decision process with countable state space (states 0, 1, 2, …) which is skip-free to the right (a transition from i to j is impossible if j > i + 1). In this type of system it is easy to calculate by forward recursion the maximal total expected reward going from state 0 to state i; the same can be done, of course, for the case where a constant g is subtracted from the one-period reward function (g-revised reward). Let −w^g(i) be the maximal total expected g-revised reward going from state 0 to state i. We show that w^g(·) satisfies the average-reward optimality equation. If w^g(·) satisfies a growth condition, then g = g*, the maximal average reward. For all other g, the function w^g increases or decreases so fast that this cannot be the case. Thus, in principle the solution w^g can be used to check if g < g* or g > g*, which suggests a method for approximating g* and an associated average-return optimal policy. We develop an efficient algorithm based on this idea. In a companion paper we shall show how the algorithm, or modifications of it, can be applied to some special cases, such as control of arrivals to a queue, control of the service rate, and controlled random walks.
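The forward recursion described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes the optimality equation has the standard form w(i) + g = max over a of { r(i, a) + sum over j of p(j | i, a) w(j) }, that w(0) = 0, and that every action moves up one state with positive probability. Because transitions are skip-free to the right, w(i + 1) is the only unknown in the equation at state i, so it can be solved for directly: each action gives a candidate value, and the smallest candidate is the one that makes the maximum hold with equality. The function names, the two-action random walk, and all parameter values below are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def forward_recursion(
    g: float,
    n: int,
    actions: Sequence[int],
    reward: Callable[[int, int], float],
    trans: Callable[[int, int], Dict[int, float]],
) -> List[float]:
    """Return w[0..n] with w[0] = 0 that satisfies the assumed
    optimality equation at states 0..n-1 for the given constant g."""
    w = [0.0]
    for i in range(n):
        candidates = []
        for a in actions:
            p = trans(i, a)
            up = p.get(i + 1, 0.0)
            if up <= 0.0:
                raise ValueError("sketch assumes p(i+1 | i, a) > 0")
            # Contribution of the already computed states j <= i.
            known = sum(prob * w[j] for j, prob in p.items() if j <= i)
            # Solve the equation at state i for w[i+1] under action a.
            candidates.append((w[i] + g - reward(i, a) - known) / up)
        # The smallest candidate makes the max over actions equal w[i] + g.
        w.append(min(candidates))
    return w


# Hypothetical controlled random walk: action 1 serves faster but costs more.
LAM = 0.3                   # probability of moving up one state
MU = {0: 0.3, 1: 0.5}       # probability of moving down, per action
EFFORT = {0: 0.0, 1: 1.0}   # per-period effort cost, per action
ACTIONS = (0, 1)


def reward(i: int, a: int) -> float:
    return -(i + EFFORT[a])


def trans(i: int, a: int) -> Dict[int, float]:
    down = MU[a] if i > 0 else 0.0
    p = {i + 1: LAM, i: 1.0 - LAM - down}
    if i > 0:
        p[i - 1] = down
    return p


def residual(w: List[float], g: float, i: int) -> float:
    """Left side minus right side of the assumed optimality equation at i."""
    best = max(
        reward(i, a) + sum(prob * w[j] for j, prob in trans(i, a).items())
        for a in ACTIONS
    )
    return best - (w[i] + g)


w = forward_recursion(g=-2.0, n=20, actions=ACTIONS, reward=reward, trans=trans)
```

Running the recursion for several trial values of g and inspecting how fast w grows is the idea the abstract points to for locating g*. The precise growth condition is not stated in the abstract, so the sketch stops at computing w.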

Mathematics of Operations Research

Volume 11, Issue 2

May 1986

Pages 193-384

Cite as

Jacob Wijngaard, Shaler Stidham, Jr. (1986) Forward Recursion for Markov Decision Processes with Skip-Free-to-the-Right Transitions, Part I: Theory and Algorithm. Mathematics of Operations Research 11(2):295-308.

https://doi.org/10.1287/moor.11.2.295
