Episodic Bayesian Optimal Control with Unknown Randomness Distributions
Published Online: 23 Jul 2025
https://doi.org/10.1287/opre.2023.0446
References
- (2015) Bayesian optimal control of smoothly parameterized systems. UAI’15: Proc. 31st Conf. Uncertainty Artificial Intelligence (AUAI Press, Arlington, VA), 2–11.
- (2018) Improved regret bounds for Thompson sampling in linear quadratic control problems. Internat. Conf. Machine Learn. (PMLR, New York), 1–9.
- (2008) H∞-Optimal Control and Related Minimax Design Problems – A Dynamic Game Approach (Birkhäuser, Boston).
- (1978) Stochastic Optimal Control, the Discrete Time Case (Academic Press, New York).
- (2019) Adaptive robust control under model uncertainty. SIAM J. Control Optim. 57(2):925–946.
- (2023) Statistical limit theorems in distributionally robust optimization. WSC’23: Proc. Winter Simulation Conf. (IEEE Press, Piscataway, NJ), 31–45.
- (1948) Application of the theory of martingales. Le Calcul Des Probabilites et Ses Applications (Centre National de la Recherche Scientifique, Paris), 23–27. [In French.]
- (2002) Optimal Learning: Computational Procedures for Bayes-Adaptive Markov Decision Processes (University of Massachusetts Amherst, Amherst, MA).
- (1983) Von Mises Calculus for Statistical Functionals, Lecture Notes in Statistics, vol. 19 (Springer-Verlag, New York).
- (1989) Maxmin expected utility with non-unique prior. J. Math. Econom. 18(2):141–153.
- (2002) Minimax control of discrete-time stochastic systems. SIAM J. Control Optim. 41(5):1626–1659.
- (2006) Robust control and model misspecification. J. Econom. Theory 128(1):45–90.
- (2015) Bayesian adaptive control. Stochastic Systems: Estimation, Identification, and Adaptive Control (Society for Industrial and Applied Mathematics, Philadelphia), 231–258.
- (2024) Numerical methods for convex multistage stochastic optimization. Foundations Trends Optim. 6(2):63–144.
- (2006) Model uncertainty, robust optimization and learning. Models, Methods, and Applications for Innovative Decision Making, Tutorials in Operations Research (INFORMS, Catonsville, MD), 66–94.
- (2022) Bayesian risk Markov decision processes. Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, eds. NIPS’22: Proc. 36th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 17430–17442.
- (2024) Bayesian stochastic gradient descent for stochastic optimization with streaming input data. SIAM J. Optim. 34(1):389–418.
- (2014) Near-optimal reinforcement learning in factored MDPs. NIPS’14: Proc. 28th Internat. Conf. Neural Inform. Processing Systems, vol. 1 (MIT Press, Cambridge, MA), 604–612.
- (2017) Why is posterior sampling better than optimism for reinforcement learning? Internat. Conf. Machine Learning (PMLR), 2701–2710.
- (2013) (More) efficient reinforcement learning via posterior sampling. NIPS’13: Proc. 27th Internat. Conf. Neural Inform. Processing Systems, vol. 2 (Curran Associates, Red Hook, NY), 3003–3011.
- (1991) Multi-stage stochastic optimization applied to energy planning. Math. Programming 52(1–3):359–375.
- (1975) Bayesian dynamic programming. Adv. Appl. Probab. 7(2):330–348.
- (1965) On Bayes procedures. Z Wahrscheinlichkeitstheorie Verw. Gebiete 4:10–26.
- (2012) Minimax and risk averse multistage stochastic programming. Eur. J. Oper. Res. 219(3):719–726.
- (2021) Central limit theorem and sample complexity of stationary stochastic programs. Oper. Res. Lett. 49(5):676–681.
- (2021) Lectures on Stochastic Programming: Modeling and Theory, 3rd ed. (Society for Industrial and Applied Mathematics, Philadelphia).
- (2023) Bayesian distributionally robust optimization. SIAM J. Optim. 33(2):1279–1304.
- (2014) A note on the strong formulation of stochastic control problems with model uncertainty. Electronic Comm. Probab. 19(81):1–10.
- (2000) A Bayesian framework for reinforcement learning. ICML’00: Proc. 17th Internat. Conf. Machine Learn. (Morgan Kaufmann Publishers Inc., San Francisco), 943–950.
- (2017) Posterior sampling for large scale reinforcement learning. Preprint, submitted November 21, https://arxiv.org/abs/1711.07979.
- (2019) Infinite horizon average cost dynamic programming subject to total variation distance ambiguity. SIAM J. Control Optim. 57(4):2843–2872.
- (1998) Asymptotic Statistics (Cambridge University Press, Cambridge, UK).
- (2016) Distributionally robust control of constrained stochastic systems. IEEE Trans. Automatic Control 61(2):430–442.
- (2018) A Bayesian risk approach to data-driven stochastic optimization: Formulations and asymptotics. SIAM J. Optim. 28(2):1588–1612.
- (2018) Wasserstein distributionally robust stochastic control: A data-driven approach. IEEE Trans. Automatic Control 66(8):3863–3870.
- (2000) Foundations of Inventory Management (McGraw-Hill, New York).

