LEGO: Optimal Online Learning Under Sequential Price Competition

Shukai Li
[email protected]
https://orcid.org/0000-0003-3540-5035
Operations and Business Analytics, New York University Shanghai, Shanghai 200124, China

Cong Shi (Corresponding Author)
[email protected]
https://orcid.org/0000-0003-3564-3391
Department of Management, Miami Herbert Business School, University of Miami, Coral Gables, Florida 33146

Sanjay Mehrotra
[email protected]
https://orcid.org/0000-0003-1106-1901
Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois 60208

Published Online: 3 Jun 2026
https://doi.org/10.1287/opre.2024.1085

Abstract

We consider price competition among multiple sellers over a selling horizon of $T$ periods. In each period, sellers simultaneously set prices and subsequently observe their own demand realizations, which are unobservable to competitors. The realized demand of each seller depends on the prices of all sellers and follows a private, unknown linear model. We propose a least-squares estimation and then gradient optimization (LEGO) policy, which does not require sellers to share demand information or coordinate price experiments throughout the selling horizon. We show that, when employed by all sellers, our policy converges to the Nash equilibrium prices at a rate on the order of $1 / \sqrt{T}$, matching the outcome under full information, whereas each seller achieves regret of optimal order $\sqrt{T}$ relative to a dynamic benchmark. Our analysis further shows that the unknown individual price sensitivity is the main source of difficulty in dynamic pricing under sequential competition, leading to worst-case regret of order $\sqrt{T}$. If each seller knows his or her individual price sensitivity coefficient, then a gradient-based policy achieves a convergence rate of order $1 / T$ to the Nash equilibrium as well as regret of optimal order $\log T$.
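To make the known-sensitivity case concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear demand model of the form $D_i = a_i - b_i p_i + \sum_{j \neq i} c_{ij} p_j + \text{noise}$, in which case the derivative of seller $i$'s expected revenue $p_i \, E[D_i]$ with respect to $p_i$ is $E[D_i] - b_i p_i$. A seller who knows $b_i$ can therefore form an unbiased gradient estimate from his or her own observed demand alone. The parameter values, the step-size schedule, the price bounds, and the names `nash_prices` and `run_gradient_policy` are all hypothetical choices made for illustration.

```python
import numpy as np


def nash_prices(a, b, C):
    """Interior Nash equilibrium of the assumed linear-demand pricing game.

    Seller i's first-order condition is
        a_i - 2 * b_i * p_i + sum_j C[i, j] * p_j = 0,
    which is the linear system (2 * diag(b) - C) p = a.
    """
    M = 2.0 * np.diag(b) - C
    return np.linalg.solve(M, a)


def run_gradient_policy(a, b, C, T, noise_std=0.5, step=0.5, seed=0):
    """Every seller runs stochastic gradient ascent on its own revenue.

    Each seller uses only its own price, its own realized demand, and its
    own (assumed known) price sensitivity b_i.
    """
    rng = np.random.default_rng(seed)
    n = len(a)
    p = np.full(n, 1.0)  # arbitrary starting prices (illustrative)
    for t in range(1, T + 1):
        # Realized demand under the assumed linear model with Gaussian noise.
        demand = a - b * p + C @ p + noise_std * rng.standard_normal(n)
        # Unbiased estimate of d(revenue_i)/d(p_i) = E[D_i] - b_i * p_i.
        grad = demand - b * p
        # Diminishing step size; prices kept inside illustrative bounds.
        p = np.clip(p + (step / (t + 1)) * grad, 0.1, 10.0)
    return p


# Hypothetical two-seller instance; C has a zero diagonal.
a = np.array([10.0, 8.0])          # demand intercepts
b = np.array([2.0, 1.5])           # own-price sensitivities
C = np.array([[0.0, 0.5],
              [0.4, 0.0]])         # cross-price effects

p_star = nash_prices(a, b, C)
p_T = run_gradient_policy(a, b, C, T=5000)
print("Nash prices:     ", p_star)
print("Prices after T:  ", p_T)
print("Distance to Nash:", np.linalg.norm(p_T - p_star))
```

When $b_i$ is unknown, the term `b * p` in the gradient estimate is no longer available, which is consistent with the abstract's remark that the individual price sensitivity is the main source of difficulty. The sketch does not attempt to reproduce the least-squares estimation step of the LEGO policy or its convergence rates.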

Funding: S. Li and S. Mehrotra acknowledge support from the National Science Foundation [Grant CMMI-1763035]. C. Shi acknowledges support from an Amazon Research Award and a Provost Research Award from the University of Miami.

Supplemental Material: All supplemental materials, including the code, data, and files required to reproduce the results, are available at https://doi.org/10.1287/opre.2024.1085.

Received: June 04, 2024
Accepted: April 30, 2026
Published Online: June 03, 2026

Cite as

Shukai Li, Cong Shi, Sanjay Mehrotra (2026) LEGO: Optimal Online Learning Under Sequential Price Competition. Operations Research 0(0).

https://doi.org/10.1287/opre.2024.1085

Acknowledgments

The authors thank area editor Professor Xi Chen, the associate editor, and two anonymous referees for their careful reading and constructive comments, which led to several substantial improvements to the paper. Shukai Li and Sanjay Mehrotra acknowledge support from the National Science Foundation [Grant CMMI-1763035]. Cong Shi acknowledges support from an Amazon Research Award and a Provost Research Award from the University of Miami.
