Open Problem—Convergence and Asymptotic Optimality of the Relative Value Iteration in Ergodic Control

Published Online: https://doi.org/10.1287/stsy.2019.0040

References

  • Arapostathis A, Borkar VS (2019) Average cost optimal control under weak hypotheses: Relative value iterations. Preprint, submitted February 4, arXiv:1902.01048.
  • Arapostathis A, Biswas A, Carroll J (2017) On solutions of mean field games with ergodic cost. J. Math. Pures Appl. (9) 107(2):205–251.
  • Arapostathis A, Borkar VS, Kumar KS (2013) Relative value iteration for stochastic differential games. Křivan V, Zaccour G, eds. Advances in Dynamic Games, Annals of the International Society of Dynamic Games, vol. 13 (Birkhäuser/Springer, Cham), 3–27.
  • Arapostathis A, Borkar VS, Kumar KS (2014) Convergence of the relative value iteration for the ergodic control problem of nondegenerate diffusions under near-monotone costs. SIAM J. Control Optim. 52(1):1–31.
  • Cavazos-Cadena R (1996) Value iteration in a class of communicating Markov decision chains with the average cost criterion. SIAM J. Control Optim. 34(6):1848–1873.
  • Cavazos-Cadena R (1998) A note on the convergence rate of the value iteration scheme in controlled Markov chains. Systems Control Lett. 33(4):221–230.
  • Chen RR, Meyn S (1999) Value iteration and optimization of multiclass queueing networks. Queueing Systems Theory Appl. 32(1-3):65–97.
  • Della Vecchia E, Di Marco S, Jean-Marie A (2012) Illustrated review of convergence conditions of the value iteration algorithm and the rolling horizon procedure for average-cost MDPs. Ann. Oper. Res. 199:193–214.
  • Douc R, Fort G, Guillin A (2009) Subgeometric rates of convergence of f-ergodic strong Markov processes. Stochastic Process. Appl. 119(3):897–923.
  • Hairer M (2016) Convergence of Markov processes. Lecture notes, University of Warwick. Accessed September 1, 2019, http://www.hairer.org/notes/Convergence.pdf.
  • Hernández-Lerma O, Lasserre JB (1990) Error bounds for rolling horizon policies in discrete-time Markov control processes. IEEE Trans. Automat. Control 35(10):1118–1124.
  • Ichihara N (2012) Large time asymptotic problems for optimal stochastic control with superlinear cost. Stochastic Process. Appl. 122(4):1248–1275.
  • Montes-de Oca R, Hernández-Lerma O (1996) Value iteration in average cost Markov control processes on Borel spaces. Acta Appl. Math. 42(2):203–222.
  • White DJ (1963) Dynamic programming, Markov chains, and the method of successive approximations. J. Math. Anal. Appl. 6:373–376.
  • Yu H (2015) On convergence of value iteration for a class of total cost Markov decision processes. SIAM J. Control Optim. 53(4):1982–2016.