Long-Term Values in Markov Decision Processes and Repeated Games, and a New Distance for Probability Spaces
Published Online: 28 Nov 2016 https://doi.org/10.1287/moor.2016.0814
References
- (1993) Discrete-time controlled Markov processes with average cost criterion: A survey. SIAM J. Control Optim. 31(2):282–344.
- (1972) Real Analysis and Probability (Academic Press, New York).
- (1965) Optimal control of Markov processes with incomplete state information. J. Math. Anal. Appl. 10(1):174–205.
- (1977) Applied Abstract Analysis (John Wiley & Sons, New York).
- (1995) Repeated Games with Incomplete Information (MIT Press, Cambridge, MA).
- (1957) A Markovian decision process. Technical Report P-1066, RAND Corporation, Santa Monica, CA.
- (1976) The asymptotic theory of stochastic games. Math. Oper. Res. 1(3):197–208.
- (1931) Proof of the ergodic theorem. Proc. Natl. Acad. Sci. USA 17(12):656–660.
- (1962) Discrete dynamic programming. Ann. Math. Statist. 33(2):719–726.
- (2000) Average cost dynamic programming equations for controlled Markov chains with partial observations. SIAM J. Control Optim. 39(3):673–681.
- (2007) Dynamic programming for ergodic control of Markov chains under partial observations: A correction. SIAM J. Control Optim. 45(6):2299–2304.
- (2014) Asymmetric warfare. Preprint arXiv:1403.1385.
- (2014) Existence of asymptotic values for nonexpansive stochastic control systems. Appl. Math. Optim. 70(1):1–28.
- (1956) Existence et unicité des représentations intégrales au moyen des points extrémaux dans les cônes convexes. Séminaire Bourbaki 4:33–47.
- (1968) Multichain Markov renewal programs. SIAM J. Appl. Math. 16(3):468–487.
- (1965) How to Gamble If You Must: Inequalities for Stochastic Processes (McGraw-Hill, New York).
- (2002) Real Analysis and Probability, Cambridge Studies in Advanced Mathematics, Vol. 74 (Cambridge University Press, Cambridge, UK).
- (1967) Games with incomplete information played by “Bayesian” players, I–III: Part I. The basic model. Management Sci. 14(3):159–182.
- (1979) Linear programming and Markov decision chains. Management Sci. 25(4):352–362.
- (2010) On a Markov game with one-sided information. Oper. Res. 58(4):1107–1115.
- (1992) A uniform Tauberian theorem in dynamic programming. Math. Oper. Res. 17(2):303–307.
- (1996) Discrete Gambling and Stochastic Games (Springer, Berlin).
- (1934) Extension of range of functions. Bull. Amer. Math. Soc. 40(12):837–842.
- (1987) Repeated games. Proc. Internat. Congress Mathematicians, Berkeley, California, USA, 1986 (American Mathematical Society, Providence, RI), 1528–1577.
- (1981) Stochastic games. Internat. J. Game Theory 10(2):53–66.
- (1985) Formulation of Bayesian analysis for games with incomplete information. Internat. J. Game Theory 14(1):1–29.
- (2005) Repeated Games (Cambridge University Press, Cambridge, UK).
- (2008) Existence of optimal strategies in Markov games with incomplete information. Internat. J. Game Theory 37(4):581–596.
- (2011) On the existence of a limit value in some nonexpansive optimal control problems. SIAM J. Control Optim. 49(5):2118–2132.
- (2006) The value of Markov chain games with lack of information on one side. Math. Oper. Res. 31(3):490–512.
- (2011) Uniform value in dynamic programming. J. Eur. Math. Soc. 13(2):309–330.
- (2014) General limit value in dynamic programming. J. Dynam. Games 1(3):471–484.
- (2012) The value of repeated games with an informed controller. Math. Oper. Res. 37(1):154–179.
- (1974) Incomplete information in Markovian decision models. Ann. Statist. 2(6):1327–1334.
- (2002) Blackwell optimality in Markov decision processes with partial observation. Ann. Statist. 30(4):1178–1193.
- (2004) Stochastic games with a single controller and incomplete information. SIAM J. Control Optim. 43(1):86–110.
- (1991) On the construction of nearly optimal strategies for a general problem of control of partially observed diffusions. Stochastics 37(1–2):15–47.
- (2012) Personal communication, September 2012.
- (1970) Discrete-time Markovian decision processes with incomplete state observation. Ann. Math. Statist. 41(1):78–86.
- (1953) Stochastic games. Proc. Natl. Acad. Sci. USA 39(10):1095–1100.
- (2002) A First Course on Zero-Sum Repeated Games, Mathematiques and Applications, Vol. 37 (Springer, New York).
- (2003) Topics in Optimal Transportation, Graduate Studies in Mathematics, Vol. 58 (American Mathematical Society, Providence, RI).
- (1932) Proof of the quasi-ergodic hypothesis. Proc. Natl. Acad. Sci. USA 18(1):70–82.

