On Average Reward Semi-Markov Decision Processes with a General Multichain Structure

Published Online: https://doi.org/10.1287/moor.1030.0077

References

  • Arapostathis A., Borkar V. S., Fernandez-Gaucherand E., Ghosh M. K., Marcus S. I. Discrete-time controlled Markov processes with average cost criterion: A survey. SIAM J. Control Optim. (1993) 31(2):282–344
  • Beutler F. J., Ross K. W. Time-average optimal constrained semi-Markov decision processes. Adv. Appl. Probab. (1986) 18:341–359
  • Beutler F. J., Ross K. W. Uniformization for semi-Markov decision processes under stationary policies. J. Appl. Probab. (1987) 24:644–656
  • Das T. K., Gosavi A., Mahadevan S., Marchalleck N. Solving semi-Markov decision problems using average reward reinforcement learning. Management Sci. (1999) 45(4):560–574
  • DeCani J. S. A dynamic programming algorithm for embedded Markov chains when the planning horizon is at infinity. Management Sci. (1964) 10:716–733
  • Denardo E. V., Fox B. L. Multichain Markov renewal programs. SIAM J. Appl. Math. (1968) 16(3):468–487
  • Deppe H. On the existence of average optimal policies in semi-regenerative decision models. Math. Oper. Res. (1984) 9(4):558–575
  • Derman C. Finite State Markovian Decision Processes (1970) (Academic, New York)
  • Dynkin E. B., Yushkevich A. A. Controlled Markov Processes (1979) (Springer-Verlag, New York)
  • Federgruen A., Tijms H. C. The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms. J. Appl. Probab. (1978) 15:356–373
  • Federgruen A., Schweitzer P. J. A fixed point approach to undiscounted Markov renewal programs. SIAM J. Algebra Discrete Math. (1984) 5(4):539–550
  • Federgruen A., Hordijk A., Tijms H. C. Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion. Stochastic Proc. Appl. (1979) 9:223–235
  • Federgruen A., Schweitzer P. J., Tijms H. C. Denumerable undiscounted semi-Markov decision processes with unbounded rewards. Math. Oper. Res. (1983) 8(2):298–313
  • Fox B. Markov renewal programming by linear fractional programming. SIAM J. Appl. Math. (1966) 14(6):1418–1432
  • Fox B. Existence of stationary optimal policies for some Markov renewal programs. SIAM Rev. (1967) 9(3):573–576
  • Hinderer K., Waldmann K-H. Approximate solution of Markov renewal programs with finite time horizon. SIAM J. Control Optim. (1998) 37(2):502–520
  • Howard R. A. Semi-Markovian decision processes. Proc. Internat. Statist. Inst. (1963) Ottawa, Canada
  • Hu Q., Liu J. An introduction to Markov decision processes (Chinese). (2000) (Xidian University, Xian, China)
  • Jewell W. S. Markov-renewal programming. I: Formulation, finite return models. Oper. Res. (1963a) 11:938–948
  • Jewell W. S. Markov-renewal programming. II: Infinite return models, example. Oper. Res. (1963b) 11:949–971
  • Kallenberg L. C. M. Linear programming and finite Markov control problems. (1983) (Math. Centre, Amsterdam, The Netherlands)
  • Lippman S. A. Maximal average-reward policies for semi-Markov decision processes with arbitrary state and action space. Ann. Math. Statist. (1971) 42(5):1717–1726
  • Mann E., Janssen J. The functional equations of undiscounted denumerable state Markov renewal programming. Semi-Markov Model (1986) (Plenum, New York) 79–96
  • Platzman L. Improved conditions for convergence in undiscounted Markov renewal programming. Oper. Res. (1977) 25:529–533
  • Puterman M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994) (John Wiley and Sons, New York)
  • Ross S. M. Applied Probability Models with Optimization Applications (1970) (Holden Day, San Francisco)
  • Schal M. On the second optimality equation for semi-Markov decision models. Math. Oper. Res. (1992) 17(2):470–486
  • Schweitzer P. J. Iterative solution of the functional equations of undiscounted Markov renewal programming. J. Math. Anal. Appl. (1971) 34:495–501
  • Schweitzer P. J. A value-iteration scheme for undiscounted multichain Markov renewal programs. Zeitschrift Oper. Res. (1984) 28:143–152
  • Schweitzer P. J. Iterative bounds on the relative value vector in undiscounted Markov renewal programming. Zeitschrift Oper. Res. (1985) 29:269–284
  • Schweitzer P. J., Federgruen A. The functional equations of undiscounted Markov renewal programming. Math. Oper. Res. (1978) 3(4):308–321
  • Sennott L. I. Average cost semi-Markov decision processes and the control of queueing systems. Probab. Engrg. Inform. Sci. (1989) 3:247–272