Last-Iterate Convergence in No-Regret Learning: Games with Reference Effects Under Logit Demand

Mengzi Amy Guo
[email protected]
https://orcid.org/0000-0001-8869-4673
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720

Donghao Ying
[email protected]
https://orcid.org/0009-0001-7329-5917
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720

Javad Lavaei
[email protected]
https://orcid.org/0000-0003-4294-1338
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720

Zuo-Jun Max Shen (Corresponding Author)
[email protected]
https://orcid.org/0000-0003-4538-8312
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley, California 94720; and Faculty of Engineering and Faculty of Business and Economics, University of Hong Kong, Hong Kong, China

Published Online: 21 May 2025
https://doi.org/10.1287/mnsc.2023.03464

References

Agrawal S, Tang W (2024) Dynamic pricing and learning with long-term reference effects. Preprint, submitted February 19, https://arxiv.org/abs/2402.12562.
Arrowsmith DK, Place CM (1990) An Introduction to Dynamical Systems (Cambridge University Press, Cambridge, UK).
Ba W, Lin T, Zhang J, Zhou Z (2021) Doubly optimal no-regret online learning in strongly monotone games with bandit feedback. Preprint, submitted December 6, https://arxiv.org/abs/2112.02856.
Boyd S, Xiao L, Mutapcic A (2003) Subgradient methods. Lecture notes of EE392o, Stanford University, Autumn Quarter 2004:2004–2005, Stanford University, Stanford, CA.
Bravo M, Leslie D, Mertikopoulos P (2018) Bandit learning in concave N-person games. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates Inc., Red Hook, NY), 5666–5676.
Briesch RA, Krishnamurthi L, Mazumdar T, Raj SP (1997) A comparative analysis of reference price models. J. Consumer Res. 24(2):202–214.
Chen N, Nasiry J (2020) Does loss aversion preclude price variation? Manufacturing Service Oper. Management 22(2):383–395.
Chen X, Hu P, Hu Z (2017) Efficient algorithms for the dynamic pricing problem with reference price effect. Management Sci. 63(12):4389–4408.
Colombo L, Labrecciosa P (2021) Dynamic oligopoly pricing with reference-price effects. Eur. J. Oper. Res. 288(3):1006–1016.
den Boer AV, Keskin NB (2022) Dynamic pricing with demand learning and reference effects. Management Sci. 68(10):7112–7130.
Federgruen A, Lu L (2016) Price competition based on relative prices. Columbia Business School Research Paper No. 13-9, Columbia University, New York.
Golrezaei N, Jaillet P, Liang JCN (2020) No-regret learning in price competitions under consumer reference effects. Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, eds. Advances in Neural Information Processing Systems, vol. 33 (Curran Associates Inc., Red Hook, NY), 20766–20778.
Goyal V, Li S, Mehrotra S (2023) Learning to price under competition for multinomial logit demand. Preprint, submitted October 10, http://dx.doi.org/10.2139/ssrn.4572453.
Guo MA, Shen ZJM (2024) Oligopoly price competitions under exogenous and endogenous reference effects. Preprint, submitted February 11, https://ssrn.com/abstract=4742001.
Guo MA, Jiang H, Shen ZJM (2022) Multi-product dynamic pricing with reference effects under logit demand. Preprint, submitted August 12, http://dx.doi.org/10.2139/ssrn.4189049.
Guo MA, Ying D, Lavaei J, Shen ZJ (2023) No-regret learning in dynamic competition with reference effects under logit demand. Oh A, Naumann T, Globerson A, Saenko K, Hardt M, Levine S, eds. Advances in Neural Information Processing Systems, vol. 36 (Curran Associates Inc., Red Hook, NY), 10567–10603.
Han Y, Weissman T, Zhou Z (2024) Optimal no-regret learning in repeated first-price auctions. Oper. Res. 73(1):209–238.
Hardie BG, Johnson EJ, Fader PS (1993) Modeling loss aversion and reference dependence effects on brand choice. Marketing Sci. 12(4):378–394.
Jiang H, Cao J, Shen ZJM (2022) Intertemporal pricing via nonparametric estimation: Integrating reference effects and consumer heterogeneity. Manufacturing Service Oper. Management 26(1):28–46.
Kahneman D, Tversky A (1979) Prospect theory: An analysis of decision under risk. Econometrica 47(2):263–292.
Kalyanaram G, Winer RS (1995) Empirical generalizations from reference price research. Marketing Sci. 14(3 suppl):G161–G169.
Khalil H (2002) Nonlinear Systems (Pearson Education, Prentice Hall, Saddle River, NJ).
Krishnamurthi L, Mazumdar T, Raj S (1992) Asymmetric response to price in consumer brand choice and purchase quantity decisions. J. Consumer Res. 19(3):387–400.
Li J, So AMC, Ma WK (2020) Understanding notions of stationarity in nonsmooth optimization: A guided tour of various constructions of subdifferential for nonsmooth functions. IEEE Signal Processing Magazine 37(5):18–31.
Lin T, Zhou Z, Mertikopoulos P, Jordan MI (2020) Finite-time last-iterate convergence for multi-agent learning in games. Daume H III, Singh A, eds. Internat. Conf. Machine Learn., vol. 119 (PMLR, New York), 6161–6171.
McFadden D (1974) Conditional logit analysis of qualitative choice behavior. Zarembka P, ed. Frontiers in Economics (Academic Press, New York), 105–142.
Mertikopoulos P, Zhou Z (2019) Learning in games with continuous action sets and unknown payoff functions. Math. Programming 173(1):465–507.
Mertikopoulos P, Papadimitriou C, Piliouras G (2018) Cycles in adversarial regularized learning. Czumaj A, ed. SODA '18: Sympos. Discrete Algorithms New Orleans Louisiana (SIAM, Philadelphia), 2703–2717.
Mertikopoulos P, Lecouat B, Zenati H, Foo C-S, Chandrasekhar V, Piliouras G (2019) Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile. ICLR (Ernest N. Morial Convention Center, New Orleans), 1–23.
Nesterov Y (2014) Introductory Lectures on Convex Optimization: A Basic Course, vol. 87 (Springer Science & Business Media, New York).
Palaiopanos G, Panageas I, Piliouras G (2017) Multiplicative weights update with constant step-size in congestion games: Convergence, limit cycles and chaos. Guyon I, Von Luxburg U, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, eds. Advances in Neural Information Processing Systems (Curran Associates Inc., Red Hook, NY), 5874–5884.
Popescu I, Wu Y (2007) Dynamic pricing strategies with reference effects. Oper. Res. 55(3):413–429.
Qin H, Simchi-Levi D, Wang L (2022) Data-driven approximation schemes for joint pricing and inventory control models. Management Sci. 68(9):6591–6609.
Wang R (2018) When prospect theory meets consumer choice models: Assortment and pricing management with reference prices. Manufacturing Service Oper. Management 20(3):583–600.

Volume 72, Issue 2

February 2026

Pages 783-1726, iv-vi

Information

Received: October 27, 2023
Accepted: February 14, 2025
Published Online: May 21, 2025

Cite as

Mengzi Amy Guo, Donghao Ying, Javad Lavaei, Zuo-Jun Max Shen (2025) Last-Iterate Convergence in No-Regret Learning: Games with Reference Effects Under Logit Demand. Management Science 72(2):1007-1024.

https://doi.org/10.1287/mnsc.2023.03464

Acknowledgments

The authors thank Editor Chung Piaw Teo, the associate editor, and the three anonymous reviewers for their constructive feedback. The authors also thank the anonymous reviewers of NeurIPS 2023 for their valuable comments. The algorithm and its convergence for loss-neutral reference effects in duopoly competition (i.e., Theorems 1 and 2) were proposed in the authors' conference proceedings paper (Guo et al. 2023). This work improves the convergence rate and regret in the loss-neutral scenario and extends all results to oligopoly competition with potentially asymmetric reference effects.
