Average Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

Published Online: https://doi.org/10.1287/moor.1120.0555

References

  • Arapostathis A, Borkar VS, Fernandez-Gaucherand E, Ghosh MK, Marcus SI. Discrete-time controlled Markov processes with average cost criterion: A survey. SIAM J. Control Optim. (1993) 31(2):282–344
  • Bather J. Optimal decision procedures for finite Markov chains. Part I: Examples. Adv. Appl. Probab. (1973) 5(2):328–339
  • Berge C. Topological Spaces (1963) (Macmillan, New York)
  • Bertsekas DP, Shreve SE. Stochastic Optimal Control: The Discrete-Time Case (1996) (Athena Scientific, Belmont, MA)
  • Billingsley P. Convergence of Probability Measures (1968) (John Wiley & Sons, New York)
  • Blackwell D. Discrete dynamic programming. Ann. Math. Statist. (1962) 33(2):719–726
  • Cavazos-Cadena R. A counterexample on the optimality equation in Markov decision chains with the average cost criterion. Systems and Control Lett. (1991) 16(5):387–392
  • Chen RC, Feinberg EA. Compactness of the space of non-randomized policies in countable-state sequential decision processes. Math. Methods Oper. Res. (2010) 71(2):307–323
  • Chitashvili RY. A controlled finite Markov chain with an arbitrary set of decisions. Theory Probab. Appl. (1975) 20(4):839–847
  • Derman C. On sequential decisions and Markov chains. Management Sci. (1962) 9(1):16–24
  • Dynkin EB, Yushkevich AA. Controlled Markov Processes (1979) (Springer-Verlag, New York)
  • Feinberg EA. An ϵ-optimal control of a finite Markov chain. Theory Probab. Appl. (1980) 25(1):70–81
  • Feinberg EA, Lewis ME. Optimality of four-threshold policies in inventory systems with customer returns and borrowing/storage options. Probab. Engrg. Inform. Sci. (2004) 19(1):45–71
  • Feinberg EA, Lewis ME. Optimality inequalities for average cost Markov decision processes and the stochastic cash balance problem. Math. Oper. Res. (2007) 32(4):769–783
  • Feinberg EA, Kasyanov PO, Zadoianchuk NV. Berge’s theorem for noncompact image sets. J. Math. Anal. Appl. (2012). Forthcoming
  • Feinberg EA, Kasyanov PO, Zadoianchuk NV. Fatou’s lemma for weakly converging probabilities. (2012). http://arXiv:1206.4073v1
  • Gubenko LG, Shtatland ES. On controlled, discrete-time Markov decision processes. Theory Probab. Math. Statist. (1975) 7:47–61
  • Hernández-Lerma O. Average optimality in dynamic programming on Borel spaces—Unbounded costs and controls. Systems and Control Lett. (1991) 17(3):237–242
  • Hernández-Lerma O, Lasserre JB. Discrete-Time Markov Control Processes: Basic Optimality Criteria (1996) (Springer-Verlag, New York)
  • Hernández-Lerma O, Lasserre JB. Fatou’s lemma and Lebesgue’s convergence theorem for measures. J. Appl. Math. Stochastic Anal. (2000) 13(2):137–146
  • Kechris AS. Classical Descriptive Set Theory (1995) (Springer-Verlag, New York)
  • Luque-Vásquez F, Hernández-Lerma O. A counterexample on the semicontinuity of minima. Proc. Amer. Math. Soc. (1995) 123(10):3175–3176
  • Ross SM. Non-discounted denumerable Markovian decision model. Ann. Math. Statist. (1968) 39(2):412–424
  • Ross SM. Arbitrary state Markovian decision processes. Ann. Math. Statist. (1968) 39(6):2118–2122
  • Ross SM. On the nonexistence of ϵ-optimal randomized stationary policies in average cost Markov decision models. Ann. Math. Statist. (1971) 42(5):1767–1768
  • Schäl M. Average optimality in dynamic programming with general state space. Math. Oper. Res. (1993) 18(1):163–172
  • Sennott LI. Stochastic Dynamic Programming and the Control of Queueing Systems (1999) (John Wiley & Sons, New York)
  • Sennott LI. Average reward optimization theory for denumerable state spaces. Feinberg EA, Shwartz A, eds. Handbook of Markov Decision Processes: Methods and Applications (2002) (Kluwer, Boston) 153–172
  • Serfozo R. Convergence of Lebesgue integrals with varying measures. Sankhya: The Indian Journal of Statistics (Series A) (1982) 44(3):380–402
  • Taylor HM. Markovian sequential replacement processes. Ann. Math. Statist. (1965) 36(6):1677–1694
  • Viskov OV, Shiryaev AN. On controls which reduce to optimal stationary regimes. Trudy Mat. Inst. Steklov. (1964) 71:35–45 [In Russian; English translation: Report Number FTD-HT-67-69, National Technical Information Service, U.S. Department of Commerce]
  • Zgurovsky MZ, Mel’nik VS, Kasyanov PO. Evolution Inclusions and Variation Inequalities for Earth Data Processing I (2011) (Springer, Berlin)