Partially Observable Total-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

Published Online: https://doi.org/10.1287/moor.2015.0746

References

  • Aoki M (1965) Optimal control of partially observable Markovian systems. J. Franklin Inst. 280(5):367–386.
  • Bäuerle N, Rieder U (2011) Markov Decision Processes with Applications to Finance (Springer, Berlin).
  • Bensoussan A (1992) Stochastic Control of Partially Observable Systems (Cambridge University Press, Cambridge, UK).
  • Bensoussan A, Çakanyildirim M, Sethi S (2007) Partially observed inventory systems: The case of zero balance walk. SIAM J. Control Optim. 46(1):176–209.
  • Bensoussan A, Çakanyildirim M, Sethi S (2011) Filtering for discrete-time Markov processes and applications to inventory control with incomplete information. Crisan D, Rozovskii B, eds. The Oxford Handbook of Nonlinear Filtering (Oxford University Press, New York), 500–525.
  • Bensoussan A, Çakanyildirim M, Sethi SP, Shi R (2010) An incomplete information inventory model with presence of inventories or backorders as only observations. J. Optimiz. Theory App. 146(3):544–580.
  • Bensoussan A, Çakanyildirim M, Minjárez-Sosa JA, Sethi SP, Shi R (2008) Partially observed inventory systems: The case of rain checks. SIAM J. Control Optim. 47(5):2490–2519.
  • Bertsekas DP, Shreve SE (1978) Stochastic Optimal Control: The Discrete-Time Case (Academic Press, New York).
  • Billingsley P (1968) Convergence of Probability Measures (John Wiley & Sons, New York).
  • Bogachev VI (2007) Measure Theory, Vol. II (Springer, Berlin).
  • Dynkin EB (1965) Controlled random sequences. Theory Probab. Appl. 10(1):1–14.
  • Dynkin EB, Yushkevich AA (1979) Controlled Markov Processes (Springer, New York).
  • Feinberg EA, Kasyanov PO, Voorneveld M (2014) Berge’s maximum theorem for noncompact image sets. J. Math. Anal. Appl. 413(2):1040–1046.
  • Feinberg EA, Kasyanov PO, Zadoianchuk NV (2012) Average-cost Markov decision processes with weakly continuous transition probabilities. Math. Oper. Res. 37(4):591–607.
  • Feinberg EA, Kasyanov PO, Zadoianchuk NV (2013) Berge’s theorem for noncompact image sets. J. Math. Anal. Appl. 397(1):255–259.
  • Feinberg EA, Kasyanov PO, Zadoianchuk NV (2014) Fatou’s lemma for weakly converging probabilities. Theory Probab. Appl. 58(4):683–689.
  • Feinberg EA, Kasyanov PO, Zgurovsky MZ (2014) Convergence of probability measures and Markov decision models with incomplete information. Proc. Steklov Inst. Math. 287:96–117.
  • Feinberg EA, Kasyanov PO, Zgurovsky MZ (2015) Uniform Fatou’s lemma. arXiv: 1504.01796.
  • Feinberg EA, Lewis ME (2007) Optimality inequalities for average cost Markov decision processes and the stochastic cash balance problem. Math. Oper. Res. 32(4):769–783.
  • Hernández-Lerma O (1989) Adaptive Markov Control Processes (Springer, New York).
  • Hernández-Lerma O, Lasserre JB (1996) Discrete-Time Markov Control Processes: Basic Optimality Criteria (Springer, New York).
  • Hernández-Lerma O, Romera R (2001) Limiting discounted-cost control of partially observable stochastic systems. SIAM J. Control Optim. 40(2):348–369.
  • Hinderer K (1970) Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter (Springer, Berlin).
  • Jaśkiewicz A, Nowak AS (2006) Zero-sum ergodic stochastic games with Feller transition probabilities. SIAM J. Control Optim. 45(3):773–789.
  • Parthasarathy KR (1967) Probability Measures on Metric Spaces (Academic Press, New York).
  • Rhenius D (1974) Incomplete information in Markovian decision models. Ann. Statist. 2(6):1327–1334.
  • Rieder U (1975) Bayesian dynamic programming. Adv. Appl. Probab. 7(2):330–348.
  • Royden HL (1968) Real Analysis, 2nd ed. (Macmillan, New York).
  • Sawaragi Y, Yoshikawa T (1970) Discrete-time Markovian decision processes with incomplete state observations. Ann. Math. Statist. 41(1):78–86.
  • Schäl M (1993) Average optimality in dynamic programming with general state space. Math. Oper. Res. 18(1):163–172.
  • Shiryaev AN (1969) Some new results in the theory of controlled random processes. Select. Translations Math. Statist. Probab. 8:49–130.
  • Shiryaev AN (1995) Probability (Springer, New York).
  • Sondik EJ (1978) The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Oper. Res. 26(2):282–304.
  • Striebel C (1975) Optimal Control for Discrete Time Stochastic Systems (Springer, Berlin).
  • Yüksel S, Linder T (2012) Optimization and convergence of observation channels in stochastic control. SIAM J. Control Optim. 50(2):864–887.
  • Yushkevich AA (1976) Reduction of a controlled Markov model with incomplete data to a problem with complete information in the case of Borel state and control spaces. Theory Probab. Appl. 21(1):153–158.