Entropy Regularization for Mean Field Games with Learning

Published Online:https://doi.org/10.1287/moor.2021.1238

Entropy regularization has been extensively adopted to improve the efficiency, the stability, and the convergence of algorithms in reinforcement learning. This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for mean field games (MFGs) with learning in a finite time horizon. Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilizing and accelerating convergence to the game equilibrium. In addition, this study leads to a policy-gradient algorithm with exploration in MFG. With this algorithm, agents are able to learn the optimal exploration scheduling, with stable and fast convergence to the game equilibrium.

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.