Adaptive Learning in Uncertain and Sequential Competition

Published Online: https://doi.org/10.1287/opre.2024.0825
