On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies

