On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies

Huizhen Yu
https://orcid.org/0000-0002-3673-0094
Department of Computing Science, University of Alberta, Edmonton, Alberta T6G2E8, Canada

Published Online: 24 Oct 2023
https://doi.org/10.1287/moor.2022.0188

Abstract

This paper concerns discrete-time infinite-horizon stochastic control systems with Borel state and action spaces and universally measurable policies. We study optimization problems on strategic measures induced by the policies in these systems. The results are then applied to risk-neutral and risk-sensitive Markov decision processes to establish the measurability of the optimal value functions and the existence of universally measurable, randomized or nonrandomized, ϵ-optimal policies, for a variety of average cost criteria and risk criteria. We also extend our analysis to a class of minimax control problems and establish similar optimality results under the axiom of analytic determinacy.

Funding: This work was supported by grants from DeepMind, the Alberta Machine Intelligence Institute (AMII), and Alberta Innovates-Technology Futures (AITF).

Mathematics of Operations Research

Volume 49, Issue 3

August 2024

Pages 1303-2047, C2

Article Information

Received: July 17, 2022
Accepted: July 09, 2023
Published Online: October 24, 2023

Cite as

Huizhen Yu (2023) On Strategic Measures and Optimality Properties in Discrete-Time Stochastic Control with Universally Measurable Policies. Mathematics of Operations Research 49(3):1734-1760.

https://doi.org/10.1287/moor.2022.0188

Acknowledgments

The author is grateful to Professor Eugene Feinberg for pointing out several important references on strategic measures and stochastic games and for valuable feedback on earlier versions of this work. The author also thanks Professor William Sudderth for mentioning the early work (Maitra et al. [38]) on Borel gambling problems, which also used Kondô’s uniformization theorem in its analysis; Professor Serdar Yüksel for helpful discussions on minimax control problems; and an anonymous reviewer, whose critical comments helped improve the paper. This paper is dedicated to the memory of Professor Sanjoy K. Mitter, an inspiring mentor.
