Existence of a Stationary Control for a Markov Chain Maximizing the Average Reward

Anders Martin-Löf
The Royal Institute of Technology, Stockholm, Sweden

Published Online: 1 Oct 1967
https://doi.org/10.1287/opre.15.5.866

Abstract

The problem of optimal control of a discrete-time stationary Markov chain with complete state information has been considered by many authors. The case with finitely many states and controls has been thoroughly investigated. Chains with infinitely many states or controls have also been considered under various assumptions concerning the reward function. In this paper the existence of a control maximizing the average reward is established for Markov chains with a finite number of states and an arbitrary compact set of possible actions in each state. It is assumed that, for every control, the chain has only one ergodic class and no transient states. The proof uses methods from convex programming and is analogous to the linear programming approach used by Wolfe and Dantzig.
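
As a rough illustration of the quantity being maximized, the sketch below computes the average reward of every deterministic stationary control in a small chain and picks the best one. It is not the paper's convex programming argument: it assumes finitely many actions per state (so brute-force enumeration is possible), and the two-state transition rows and rewards are hypothetical numbers chosen so that every control gives a chain with one ergodic class and no transient states. The names stationary_distribution and best_stationary_policy are made up for this example.

```python
from fractions import Fraction as F
from itertools import product


def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 exactly by Gauss-Jordan elimination."""
    n = len(P)
    # Rows 0..n-2: balance equations sum_i pi_i * (P[i][j] - [i == j]) = 0.
    # Last row: normalization sum_i pi_i = 1. The final column is the right-hand side.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [F(0)]
         for j in range(n - 1)]
    A.append([F(1)] * n + [F(1)])
    for col in range(n):
        # The system is nonsingular when the chain has a single ergodic class,
        # so a nonzero pivot always exists under that assumption.
        pivot = next(row for row in range(col, n) if A[row][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for row in range(n):
            if row != col and A[row][col] != 0:
                factor = A[row][col]
                A[row] = [vr - factor * vc for vr, vc in zip(A[row], A[col])]
    return [A[i][n] for i in range(n)]


def best_stationary_policy(P, r):
    """Enumerate deterministic stationary controls; return (average reward, control)."""
    n = len(P)
    best = None
    for policy in product(*(range(len(P[s])) for s in range(n))):
        chain = [P[s][policy[s]] for s in range(n)]  # transition matrix under this control
        pi = stationary_distribution(chain)
        gain = sum(pi[s] * r[s][policy[s]] for s in range(n))  # long-run average reward
        if best is None or gain > best[0]:
            best = (gain, policy)
    return best


# Hypothetical data: P[s][a] is the transition row and r[s][a] the reward
# when action a is used in state s.
P = [
    [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]],
    [[F(1, 2), F(1, 2)], [F(3, 4), F(1, 4)]],
]
r = [[F(1), F(2)], [F(0), F(-1)]]

best_gain, best_policy = best_stationary_policy(P, r)
print(best_gain, best_policy)  # 4/5 (1, 0)
```

With a compact, possibly infinite, action set such enumeration is not available, which is where an existence argument of the kind described in the abstract is needed.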

Volume 15, Issue 5

September-October 1967

Pages 779-983

Cite as

Anders Martin-Löf, (1967) Existence of a Stationary Control for a Markov Chain Maximizing the Average Reward. Operations Research 15(5):866-871.

https://doi.org/10.1287/opre.15.5.866
