Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning

Published Online: https://doi.org/10.1287/moor.2024.0444

We prove a nonasymptotic central limit theorem (CLT) for vector-valued martingale differences using Stein’s method, and we use Poisson’s equation to extend the result to functions of Markov chains. We then show that these results can be applied to establish a nonasymptotic CLT for temporal difference learning with averaging.
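To make "temporal difference learning with averaging" concrete, here is a minimal sketch of tabular TD(0) with a running (Polyak-Ruppert style) average of the iterates on a toy two-state Markov reward process. Everything specific in it is an assumption made for illustration and is not taken from the paper: the transition matrix `P`, rewards `r`, discount `gamma`, the step size `t ** -0.75`, the tabular representation, and the helper names `td0_with_averaging` and `true_values`. It does not reproduce the paper's setting or its bounds; it only shows the kind of averaged iterate a CLT for TD learning would describe.

```python
import random


def td0_with_averaging(P, r, gamma, n_steps, seed=0):
    """Tabular TD(0) along one trajectory, plus a running average of the iterates."""
    rng = random.Random(seed)
    n = len(r)
    V = [0.0] * n       # current TD iterate
    V_bar = [0.0] * n   # running average of all iterates so far
    s = 0
    for t in range(1, n_steps + 1):
        # Sample the next state from row s of the transition matrix.
        s_next = rng.choices(range(n), weights=P[s])[0]
        alpha = t ** -0.75  # assumed decaying step size
        td_error = r[s] + gamma * V[s_next] - V[s]
        V[s] += alpha * td_error
        # Incremental mean: V_bar becomes the average of the first t iterates.
        for i in range(n):
            V_bar[i] += (V[i] - V_bar[i]) / t
        s = s_next
    return V, V_bar


def true_values(P, r, gamma, n_iters=200):
    """Fixed-point iteration for V = r + gamma * P V (reference solution)."""
    n = len(r)
    V = [0.0] * n
    for _ in range(n_iters):
        V = [r[i] + gamma * sum(P[i][j] * V[j] for j in range(n)) for i in range(n)]
    return V


# Hypothetical toy problem, chosen only for illustration.
P = [[0.9, 0.1], [0.2, 0.8]]
r = [1.0, 0.0]
gamma = 0.5

V_last, V_avg = td0_with_averaging(P, r, gamma, n_steps=100_000)
V_star = true_values(P, r, gamma)
print("averaged iterate:", V_avg)
print("reference values:", V_star)
```

Running the sketch with several seeds and looking at the spread of `sqrt(n_steps) * (V_avg - V_star)` is one informal way to see the approximately Gaussian fluctuations that a CLT of this kind concerns.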

Funding: This work was supported by the National Science Foundation [Grants CNS 23-12714, CCF 22-07547, and CNS 21-06801] and the Air Force Office of Scientific Research [Grant FA9550-24-1-0002].
