Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning

R. Srikant
https://orcid.org/0000-0003-1483-5204
Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, Illinois 61801; and Coordinated Science Laboratory, University of Illinois Urbana-Champaign, Urbana, Illinois 61801

Published Online: 3 Oct 2025
https://doi.org/10.1287/moor.2024.0444

Abstract

We prove a nonasymptotic central limit theorem (CLT) for vector-valued martingale differences using Stein’s method, and we use Poisson’s equation to extend the result to functions of Markov chains. We then show that these results can be applied to establish a nonasymptotic CLT for temporal difference learning with averaging.
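To make "temporal difference learning with averaging" concrete, here is a minimal sketch of tabular TD(0) on a small Markov reward process, with the iterates averaged along the trajectory. Everything specific in it is an illustrative assumption rather than something taken from the paper: the two-state chain, the rewards, the discount factor, the step-size schedule k^(-0.75), and the function names `td0_with_averaging` and `exact_values_two_state`.

```python
import random


def exact_values_two_state(P, r, gamma):
    """Solve (I - gamma * P) v = r for a two-state chain via Cramer's rule."""
    a, b = 1.0 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1.0 - gamma * P[1][1]
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def td0_with_averaging(P, r, gamma, n_steps, seed=0):
    """Tabular TD(0) along one trajectory; returns (last iterate, iterate average)."""
    rng = random.Random(seed)
    n = len(r)
    theta = [0.0] * n       # current value estimates, one per state
    theta_bar = [0.0] * n   # running average of the iterates theta_1, ..., theta_k
    s = 0
    for k in range(1, n_steps + 1):
        # Sample the next state from row s of the transition matrix.
        s_next = rng.choices(range(n), weights=P[s])[0]
        td_error = r[s] + gamma * theta[s_next] - theta[s]
        # Assumed diminishing step size k^(-0.75); a hypothetical choice.
        theta[s] += k ** -0.75 * td_error
        # Incremental update of the average over all iterates so far.
        for i in range(n):
            theta_bar[i] += (theta[i] - theta_bar[i]) / k
        s = s_next
    return theta, theta_bar


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical transition matrix
    r = [1.0, 0.0]                 # hypothetical per-state rewards
    gamma = 0.5                    # hypothetical discount factor
    last, averaged = td0_with_averaging(P, r, gamma, n_steps=100_000)
    print("exact   :", exact_values_two_state(P, r, gamma))
    print("last    :", last)
    print("averaged:", averaged)
```

The averaged iterate is the quantity a CLT for TD learning with averaging would describe: after centering at the exact values and scaling by the square root of the number of steps, its fluctuations are expected to look approximately Gaussian. The sketch only shows how the estimator is formed and says nothing about the rate of that approximation.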

Funding: This work was supported by the National Science Foundation [Grants CNS 23-12714, CCF 22-07547, and CNS 21-06801] and the Air Force Office of Scientific Research [Grant FA9550-24-1-0002].

Articles In Advance

Received: March 12, 2024
Accepted: June 01, 2025
Published Online: October 03, 2025

Cite as

R. Srikant (2025) Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2024.0444

Acknowledgments

The author thanks Professor Siva Theja Maguluri, Georgia Tech, for many stimulating discussions and for drawing his attention to several references during the course of this work; Professor Sean Meyn, University of Florida, for pointing out the bias issue with fixed step-size TD learning, for explaining why it cannot be eliminated by averaging, and for other discussions; Professor Mehrdad Moharrami, University of Iowa, for carefully reading an earlier version of the paper and providing many useful comments; and Weichen Wu, CMU, for pointing out a missing factor in Theorem 4 in an earlier version of the paper.
