Bias and Extrapolation in Markovian Linear Stochastic Approximation with Constant Step Sizes

Published Online:https://doi.org/10.1287/moor.2024.0471

We consider linear stochastic approximation (LSA) with constant step size and Markovian data. Viewing the joint process of the data and LSA iterate as a time-homogeneous Markov chain, we prove its convergence to a unique limiting and stationary distribution and establish nonasymptotic, geometric convergence rates. Furthermore, we show that the bias vector of this limit admits an infinite series expansion with respect to the step size. Consequently, the bias is proportional to the step size up to higher-order terms. This result contrasts with LSA under independent and identically distributed data, for which the bias vanishes. In the reversible chain setting, we characterize the relationship between the bias and the mixing time of the Markovian data, establishing that they are roughly proportional to each other. Although Polyak–Ruppert tail averaging reduces the variance of LSA iterates, it does not affect the bias. The above characterization implies that the bias can be reduced using Richardson–Romberg extrapolation with m2 step sizes, eliminating the m1 leading terms in the bias expansion. This extrapolation scheme leads to an exponentially smaller bias and an improved mean-squared error both theoretically and empirically. Our results immediately apply to the temporal difference learning algorithm with linear function approximation and stochastic gradient descent applied to quadratic functions.

Funding: Y. Chen is supported in part by the National Science Foundation Division of Computing and Communication Foundations [Grants CCF-1704828 and CCF-2233152] and the University of Wisconsin–Madison [Vilas Associate Award]. Q. Xie is supported in part by the National Science Foundation Division of Computer and Network Systems [Grant CNS-1955997]; the Division of Electrical, Communications and Cyber Systems [Grants ECCS-2339794 and ECCS-2432546]; and JPMorgan Chase and Company [Faculty Research Award].

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.