Deep Reinforcement Learning for Equilibrium Computation in Multistage Auctions and Contests

Fabian R. Pieroth (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-5712-1706
School of Computation, Information and Technology, Technical University of Munich

Nils Kohring
[email protected]
https://orcid.org/0000-0002-1952-167X
School of Computation, Information and Technology, Technical University of Munich

Martin Bichler
[email protected]
https://orcid.org/0000-0001-5491-2935
School of Computation, Information and Technology, Technical University of Munich

Published Online: 5 Dec 2025
https://doi.org/10.1287/mnsc.2024.06771

Abstract

We compute equilibrium strategies in multistage games with continuous signal and action spaces, which are widely used in the management sciences and economics. Examples include sequential sales via auctions, multistage elimination contests, and Stackelberg competitions. In sequential auctions, analysts performing equilibrium analysis are required to derive not just single bids but bid functions for all possible signals or values that a bidder might have in multiple stages. Because of the continuity of the signal and action spaces, these bid functions come from an infinite dimensional space. Although such models are fundamental to game theory and its applications, equilibrium strategies are rarely known. The resulting system of nonlinear differential equations is considered intractable for all but elementary models. This has been limiting progress in game theory and is a barrier to its adoption in the field. We show that deep reinforcement learning and self-play can learn equilibrium bidding strategies for various multistage games. Verifying an equilibrium in such games is challenging because of the continuous signal and action spaces. We introduce a verification algorithm and prove that the error of this verifier decreases when considering Lipschitz continuous strategies with increasing levels of discretization and sample sizes. Leveraging the novel verification algorithm, we find equilibria in models that have not yet been explored analytically and new asymmetric equilibrium bid functions for established models of sequential auctions.
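The idea of checking a candidate equilibrium by discretization and sampling can be illustrated on a much simpler game than the multistage models studied in the paper. The sketch below is not the authors' verification algorithm. It assumes a single-stage, two-bidder first-price auction with values drawn independently and uniformly from [0, 1], and the function name `estimate_utility_loss` and all of its parameters are hypothetical. For each value on a grid, it compares the Monte Carlo estimate of the utility from following the candidate bid function with the best estimated utility over a grid of alternative bids. A small maximum gap suggests that no grid deviation is profitable.

```python
import numpy as np


def estimate_utility_loss(strategy, n_value_grid=21, n_bid_grid=101,
                          n_samples=20000, seed=0):
    """Estimate the largest gain from a unilateral deviation.

    Assumed setting (illustrative only): two-bidder first-price auction,
    values i.i.d. uniform on [0, 1], opponent follows `strategy`.
    `strategy` maps a NumPy array of values to an array of bids.
    """
    rng = np.random.default_rng(seed)
    values = np.linspace(0.0, 1.0, n_value_grid)  # discretized own values
    bids = np.linspace(0.0, 1.0, n_bid_grid)      # discretized deviation bids

    # Sample the opponent's values and map them through the candidate strategy.
    opp_bids = strategy(rng.uniform(0.0, 1.0, n_samples))

    # Estimated probability of winning with each candidate deviation bid.
    win_prob = (bids[:, None] > opp_bids[None, :]).mean(axis=1)

    # Estimated expected utility for every (value, deviation bid) pair.
    deviation_util = (values[:, None] - bids[None, :]) * win_prob[None, :]
    best_response_util = deviation_util.max(axis=1)

    # Estimated expected utility when following the candidate strategy.
    own_bids = strategy(values)
    own_win_prob = (own_bids[:, None] > opp_bids[None, :]).mean(axis=1)
    own_util = (values - own_bids) * own_win_prob

    # Worst case over the value grid of (best deviation - candidate strategy).
    return float(np.max(best_response_util - own_util))


if __name__ == "__main__":
    # b(v) = v / 2 is the known symmetric equilibrium of this toy auction.
    print("loss at b(v) = v/2:", estimate_utility_loss(lambda v: v / 2))
    # Truthful bidding is not an equilibrium in a first-price auction.
    print("loss at b(v) = v:  ", estimate_utility_loss(lambda v: v))
```

In this toy setting the estimated loss should be close to zero for the known equilibrium b(v) = v/2 and clearly positive for truthful bidding. Because the maximum is taken over noisy estimates, the result is itself only an estimate, and finer grids and larger sample sizes would be expected to tighten it.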

This paper was accepted by David Simchi-Levi, revenue management and market analytics.

Funding: This work was supported by the Deutsche Forschungsgemeinschaft [Grant BI 1057/9]. Additionally, this project has received funding from the European Research Council under the European Union’s Horizon Europe research and innovation programme [Grant Agreement 101198689].

Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2024.06771.

Articles In Advance

Information

Received: July 29, 2024
Accepted: April 09, 2025
Published Online: December 05, 2025

Cite as

Fabian R. Pieroth, Nils Kohring, Martin Bichler (2025) Deep Reinforcement Learning for Equilibrium Computation in Multistage Auctions and Contests. Management Science 0(0).

https://doi.org/10.1287/mnsc.2024.06771
