Deep Reinforcement Learning for Equilibrium Computation in Multistage Auctions and Contests

Published Online: https://doi.org/10.1287/mnsc.2024.06771

We compute equilibrium strategies in multistage games with continuous signal and action spaces, which are widely used in the management sciences and economics. Examples include sequential sales via auctions, multistage elimination contests, and Stackelberg competitions. In sequential auctions, analysts performing equilibrium analysis must derive not just single bids but bid functions for all possible signals or values that a bidder might have in multiple stages. Because the signal and action spaces are continuous, these bid functions come from an infinite-dimensional space. Although such models are fundamental to game theory and its applications, equilibrium strategies are rarely known. The resulting system of nonlinear differential equations is considered intractable for all but elementary models. This has limited progress in game theory and is a barrier to its adoption in the field. We show that deep reinforcement learning and self-play can learn equilibrium bidding strategies for various multistage games. Verifying an equilibrium in such games is challenging because of the continuous signal and action spaces. We introduce a verification algorithm and prove that, for Lipschitz continuous strategies, the error of this verifier decreases with increasing levels of discretization and sample sizes. Leveraging the novel verification algorithm, we find equilibria in models that have not yet been explored analytically and new asymmetric equilibrium bid functions for established models of sequential auctions.
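To illustrate the general idea of verifying a candidate equilibrium by discretization and sampling, the following is a minimal sketch and not the paper's verification algorithm. It assumes a much simpler setting than the multistage games studied here: a single-stage, two-bidder first-price auction with values drawn uniformly from [0, 1], for which the symmetric equilibrium bid function b(v) = v/2 is known. The function name `estimated_utility_loss` and the parameters `n_grid` and `n_samples` are hypothetical. The sketch estimates how much a bidder could gain by deviating from a candidate strategy to any bid on a finite grid, using sampled opponent bids.

```python
import numpy as np

def estimated_utility_loss(strategy, n_grid=32, n_samples=20000, seed=0):
    """Estimate the largest gain from deviating to a grid bid (illustrative only).

    Assumed setting: two-bidder first-price auction, values uniform on [0, 1],
    and the opponent plays the same candidate `strategy`.
    """
    rng = np.random.default_rng(seed)
    values = np.linspace(0.0, 1.0, n_grid)     # discretized own values
    alt_bids = np.linspace(0.0, 1.0, n_grid)   # discretized alternative bids
    # Monte Carlo sample of the opponent's bids under the candidate strategy
    opp_bids = strategy(rng.uniform(0.0, 1.0, n_samples))

    def expected_utility(value, bid):
        # First-price payoff: win if the bid beats the opponent's, pay own bid
        win_prob = np.mean(bid > opp_bids)
        return (value - bid) * win_prob

    worst_gain = 0.0
    for value in values:
        u_candidate = expected_utility(value, strategy(value))
        u_best_grid = max(expected_utility(value, bid) for bid in alt_bids)
        worst_gain = max(worst_gain, u_best_grid - u_candidate)
    return worst_gain

# Known equilibrium of this simple model versus a non-equilibrium strategy
equilibrium_loss = estimated_utility_loss(lambda v: 0.5 * v)
truthful_loss = estimated_utility_loss(lambda v: 1.0 * v)
print("estimated loss, b(v) = v/2:", equilibrium_loss)
print("estimated loss, b(v) = v:  ", truthful_loss)
```

In this toy setting, the estimated loss for b(v) = v/2 should be close to zero, whereas truthful bidding leaves a substantial gain from deviation. A finer grid and more samples tighten the estimate, which loosely mirrors the role of discretization levels and sample sizes described above.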

This paper was accepted by David Simchi-Levi, revenue management and market analytics.

Funding: This work was supported by the Deutsche Forschungsgemeinschaft [Grant BI 1057/9]. Additionally, this project has received funding from the European Research Council under the European Union’s Horizon Europe research and innovation programme [Grant Agreement 101198689].

Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2024.06771.
