An Algorithm to Identify and Compute Average Optimal Policies in Multichain Markov Decision Processes
Published Online:1 Aug 2003https://doi.org/10.1287/moor.28.3.553.16388
References
- Optimal decision procedures for finite Markov chains, II: Communicating systems. Adv. Appl. Probab. (1973a) 5:521–540Crossref, Google Scholar
- Optimal decision procedures for finite Markov chains, III: General convex systems. Adv. Appl. Probab. (1973b) 5:541–553Crossref, Google Scholar
- Nonnegative Matrices in the Mathematical Sciences (1979) (Academic Press, New York) Google Scholar
- Discrete dynamic programming. Ann. Math. Statist. (1962) 33:719–726Crossref, Google Scholar
- On minimum cost per unit time control of Markov chains. SIAM J. Control Optim. (1984) 22:965–984Crossref, Google Scholar
- Control of Markov chains with long-run average cost criterion: The dynamic programming equations. SIAM J. Control Optim. (1989) 27:642–657Crossref, Google Scholar
- Multichain Markov renewal programs. SIAM J. Appl. Math. (1968) 16:468–487Crossref, Google Scholar
- Overtaking optimality for Markov decision chains. Math. Oper. Res. (1979) 4:144–152Link, Google Scholar
- Finite States Markovian Decision Processes (1970) (Academic Press, New York) Google Scholar
- Controlled Markov Processes (1979) (Springer-Verlag, New York) Crossref, Google Scholar
- , Puterman M. L. Discounted and undiscounted value-iteration in Markov decision problems. A Survey in Dynamic Programming and Its Applications (1978) (Academic Press, New York) 23–52Crossref, Google Scholar
- , Hartley R., Thomas L. C., White D. J. A survey of asymptotic value-iteration for undiscounted Markovian decision processes. Recent Developments in Markov Decision Processes (1980) (Academic Press, New York) 73–109Google Scholar
- A fixed-point approach to undiscounted Markov renewal programs. SIAM J. Algebraic Discrete Methods (1984) 5:539–550Crossref, Google Scholar
- Denumerable undiscounted semi-Markov decision processes with unbounded rewards. Math. Oper. Res. (1983) 8:298–313Link, Google Scholar
- The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms. J. Appl. Probab. (1978) 15:356–373Crossref, Google Scholar
- On controlled finite state Markov processes with compact control sets. Theor. Probab. Appl. (1975) 20:856–861Crossref, Google Scholar
- On stationary strategies Borel dynamic programming. Math. Oper. Res. (1992) 17:392–397Link, Google Scholar
- Competitive Markov Decision Processes (1997) (Springer-Verlag, New York) Google Scholar
- An improved algorithm for solving communicating average reward Markov decision processes. Ann. Oper. Res. (1991) 28:229–242Crossref, Google Scholar
- Stochastic Models in Operations Research, Vol. II: Stochastic Optimization (1984) (McGraw-Hill, New York) Google Scholar
- Foundations of Non-Stationary Dynamic Programming with Discrete-Time Parameter (1970) (Springer-Verlag, New York) Lecture Notes in Operations Research 33Crossref, Google Scholar
- Dynamic programming and Markov potential theory. Math. Centre Tracts (1974) 51(Amsterdam)Google Scholar
- Constrained undiscounted stochastic dynamic programming. Math. Oper. Res. (1984) 9:276–289Link, Google Scholar
- On the convergence of policy iteration in undiscounted finite state Markov processes: The unichain case. Math. Oper. Res. (1987) 12:163–176Link, Google Scholar
- Overtaking and almost-sure optimality for infinite horizon Markov decision processes. Math. Oper. Res. (1996) 21:158–181Link, Google Scholar
- Existence of stationary control for a Markov chain maximizing the average reward. Oper. Res. (1967) 15:866–871Link, Google Scholar
- Improved conditions for convergence in undiscounted Markov renewal programming. Oper. Res. (1977) 25:529–533Link, Google Scholar
- Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994) (John Wiley and Sons, Inc., New York) Crossref, Google Scholar
- Multichain Markov decision processes with a sample-path constraint: A decomposition approach. Math. Oper. Res. (1991) 16:195–217Link, Google Scholar
- Multiplicative Markov decison chains. Math. Oper. Res. (1984) 9:6–24Link, Google Scholar
- Growth optimality for branching Markov decision chains. Math. Oper. Res. (1982) 7:582–601Link, Google Scholar
- Conditions for optimality in dynamic programming and for the limit of n-stage optimal policy to be optimal. Z. Wahrch. verw. Gebiete (1975) 32:179–196Crossref, Google Scholar
- On the second optimality equation for semi-Markov decision models. Math. Oper. Res. (1992) 17:470–486Link, Google Scholar
- On the solvability of Bellman's functional equations for Markov renewal programs. J. Math. Anal. Appl. (1983) 96:13–23Crossref, Google Scholar
- A Brouwer fixed-point mapping approach to communicating Markov decision processes. J. Math. Anal. Appl. (1987) 123:117–130Crossref, Google Scholar
- The functional equations of undiscounted Markov renewal programming. Math. Oper. Res. (1978) 3:308–321Link, Google Scholar
- Non-Negative Matrices and Markov Chains (1981) (Springer-Verlag, New York) Crossref, Google Scholar
- Negative dynamic programming. Ann. Math. Statist. (1966) 37:871–890Crossref, Google Scholar
- On a class of policies in general Markov decision models. Theory Probabil. Appl. (1973) 18:777–779Crossref, Google Scholar
- Stochastic Dynamic Programming (1981) (The Mathematical Centre, Amsterdam, The Netherlands) Google Scholar
- Reliability and performance models for reconfigurable computer systems. (1987) . Ph.D. thesis, University of Pennsylvania, Philadelphia, PAGoogle Scholar

