Convergence and Stability of Coupled Belief-Strategy Learning Dynamics in Continuous Games

Published Online: https://doi.org/10.1287/moor.2022.0161

References

  • [1] Acemoglu D, Bimpikis K, Ozdaglar A (2014) Dynamics of information exchange in endogenous social networks. Theor. Econom. 9(1):41–97.
  • [2] Acemoglu D, Makhdoumi A, Malekian A, Ozdaglar A (2017) Fast and slow learning from reviews. NBER Working Paper 24046, National Bureau of Economic Research, Cambridge, MA.
  • [3] Alós-Ferrer C, Netzer N (2010) The logit-response dynamics. Games Econom. Behav. 68(2):413–427.
  • [4] Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learn. 47(2):235–256.
  • [5] Banerjee AV (1992) A simple model of herd behavior. Quart. J. Econom. 107(3):797–817.
  • [6] Beggs AW (2005) On the convergence of reinforcement learning. J. Econom. Theory 122(1):1–36.
  • [7] Benaim M, Hirsch MW (1999) Mixed equilibria and dynamical systems arising from fictitious play in perturbed games. Games Econom. Behav. 29(1–2):36–72.
  • [8] Blume LE (1993) The statistical mechanics of strategic interaction. Games Econom. Behav. 5(3):387–424.
  • [9] Bravo M, Leslie D, Mertikopoulos P (2018) Bandit learning in concave n-person games. Adv. Neural Inform. Processing Systems 31.
  • [10] Cesa-Bianchi N, Lugosi G (2006) Prediction, Learning, and Games (Cambridge University Press, Cambridge, UK).
  • [11] Cominetti R, Melo E, Sorin S (2010) A payoff-based learning procedure and its application to traffic games. Games Econom. Behav. 70(1):71–83.
  • [12] Daskalakis C, Deckelbaum A, Kim A (2011) Near-optimal no-regret algorithms for zero-sum games. Proc. Twenty-Second Annu. ACM-SIAM Sympos. Discrete Algorithms (SIAM, Philadelphia), 235–254.
  • [13] Dumett MA, Cominetti R (2018) On the stability of an adaptive learning dynamics in traffic games. Preprint, submitted July 3, https://arxiv.org/abs/1807.01256.
  • [14] Foster D, Young HP (2006) Regret testing: Learning to play Nash equilibrium without knowing you have an opponent. Theor. Econom. 1(3):341–367.
  • [15] Fudenberg D, Kreps DM (1993) Learning mixed equilibria. Games Econom. Behav. 5(3):320–367.
  • [16] Fudenberg D, Kreps DM (1995) Learning in extensive-form games I. Self-confirming equilibria. Games Econom. Behav. 8(1):20–55.
  • [17] Fudenberg D, Levine DK (1993) Self-confirming equilibrium. Econometrica 61(3):523–545.
  • [18] Fudenberg D, Tirole J (1991) Game Theory (MIT Press, Cambridge, MA).
  • [19] Gale D, Kariv S (2003) Bayesian learning in social networks. Games Econom. Behav. 45(2):329–346.
  • [20] Golowich N, Pattathil S, Daskalakis C (2020) Tight last-iterate convergence rates for no-regret learning in multi-player games. Adv. Neural Inform. Processing Systems 33:20766–20778.
  • [21] Golub B, Jackson MO (2010) Naive learning in social networks and the wisdom of crowds. Amer. Econom. J. Microeconom. 2(1):112–149.
  • [22] Hart S, Mas-Colell A (2000) A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5):1127–1150.
  • [23] Hart S, Mas-Colell A (2003) Regret-based continuous-time dynamics. Games Econom. Behav. 45(2):375–394.
  • [24] Hofbauer J, Sandholm WH (2002) On the global convergence of stochastic fictitious play. Econometrica 70(6):2265–2294.
  • [25] Hofbauer J, Sandholm WH (2009) Stable games and their dynamics. J. Econom. Theory 144(4):1665–1693.
  • [26] Hofbauer J, Sorin S (2006) Best response dynamics for continuous zero-sum games. Discrete Continuous Dynam. Systems Ser. B 6(1):215–224.
  • [27] Hopkins E (2002) Two competing models of how people learn in games. Econometrica 70(6):2141–2166.
  • [28] Lattimore T, Szepesvári C (2020) Bandit Algorithms (Cambridge University Press, Cambridge, UK).
  • [29] Marden JR, Shamma JS (2012) Revisiting log-linear learning: Asynchrony, completeness and payoff-based implementation. Games Econom. Behav. 75(2):788–808.
  • [30] Marden JR, Arslan G, Shamma JS (2007) Regret based dynamics: Convergence in weakly acyclic games. Proc. 6th Internat. Joint Conf. Autonomous Agents Multiagent Systems (Association for Computing Machinery, New York), 42.
  • [31] Marden JR, Young HP, Arslan G, Shamma JS (2009) Payoff-based dynamics for multiplayer weakly acyclic games. SIAM J. Control Optim. 48(1):373–396.
  • [32] Matsui A (1992) Best response dynamics and socially stable strategies. J. Econom. Theory 57(2):343–362.
  • [33] Meigs E, Parise F, Ozdaglar A (2017) Learning dynamics in stochastic routing games. 2017 55th Annu. Allerton Conf. Commun. Control Comput. (IEEE, Piscataway, NJ), 259–266.
  • [34] Mertikopoulos P, Zhou Z (2019) Learning in games with continuous action sets and unknown payoff functions. Math. Program. 173(1):465–507.
  • [35] Milgrom P, Roberts J (1990) Rationalizability, learning, and equilibrium in games with strategic complementarities. Econometrica 58(6):1255–1277.
  • [36] Moe WW, Fader PS (2004) Dynamic conversion behavior at e-commerce sites. Management Sci. 50(3):326–335.
  • [37] Monderer D, Shapley LS (1996a) Fictitious play property for games with identical interests. J. Econom. Theory 68(1):258–265.
  • [38] Monderer D, Shapley LS (1996b) Potential games. Games Econom. Behav. 14(1):124–143.
  • [39] Mossel E, Sly A, Tamuz O (2015) Strategic learning and the topology of social networks. Econometrica 83(5):1755–1794.
  • [40] Rosen JB (1965) Existence and uniqueness of equilibrium points for concave n-person games. Econometrica 33(3):520–534.
  • [41] Samuelson L, Zhang J (1992) Evolutionary stability in asymmetric games. J. Econom. Theory 57(2):363–391.
  • [42] Sandholm WH (2010a) Local stability under evolutionary game dynamics. Theor. Econom. 5(1):27–50.
  • [43] Sandholm WH (2010b) Population Games and Evolutionary Dynamics (MIT Press, Cambridge, MA).
  • [44] Smith JM, Price GR (1973) The logic of animal conflict. Nature 246(5427):15–18.
  • [45] Syrgkanis V, Agarwal A, Luo H, Schapire RE (2015) Fast convergence of regularized learning in games. Adv. Neural Inform. Processing Systems 28.
  • [46] Taylor PD, Jonker LB (1978) Evolutionary stable strategies and game dynamics. Math. Biosci. 40(1–2):145–156.
  • [47] Wu M, Amin S, Ozdaglar AE (2021) Value of information in Bayesian routing games. Oper. Res. 69(1):148–163.
  • [48] Zhu S, Levinson D, Liu HX, Harder K (2010) The traffic and behavioral effects of the I-35W Mississippi River bridge collapse. Transportation Res. Part A Policy Pract. 44(10):771–784.