On Average Reward Semi-Markov Decision Processes with a General Multichain Structure

L. Jianyong
Institute of Applied Mathematics, Academia Sinica, Beijing 100080, China

Z. Xiaobo
Department of Industrial Engineering, Tsinghua University, Beijing 100084, China

Published Online: 1 May 2004
https://doi.org/10.1287/moor.1030.0077

Mathematics of Operations Research

Volume 29, Issue 2

May 2004

Pages 191-406

Received: May 29, 2002
Published Online: May 01, 2004

Cite as

L. Jianyong, Z. Xiaobo (2004) On Average Reward Semi-Markov Decision Processes with a General Multichain Structure. Mathematics of Operations Research 29(2):339-352.

https://doi.org/10.1287/moor.1030.0077
