Partially Observable Total-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

Eugene A. Feinberg (Corresponding Author)
Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794

Pavlo O. Kasyanov
Institute for Applied System Analysis, National Technical University of Ukraine “Kyiv Polytechnic Institute,” 03056, Kyiv, Ukraine

Michael Z. Zgurovsky
National Technical University of Ukraine “Kyiv Polytechnic Institute,” 03056, Kyiv, Ukraine

Published Online: 22 Jan 2016
https://doi.org/10.1287/moor.2015.0746

Mathematics of Operations Research

Volume 41, Issue 2

May 2016

Pages 377-744

Received: January 10, 2014
Published Online: January 22, 2016

Cite as

Eugene A. Feinberg, Pavlo O. Kasyanov, Michael Z. Zgurovsky (2016) Partially Observable Total-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities. Mathematics of Operations Research 41(2):656-681.

https://doi.org/10.1287/moor.2015.0746
