Optimality of Symmetric Independent Policies Under Decentralized Mean-Field Information Sharing for Stochastic Teams and Equivalence with McKean–Vlasov Control of a Representative Agent

Published Online: https://doi.org/10.1287/moor.2024.0489

References

  • [1] Achdou Y, Laurière M (2020) Mean field games and applications: Numerical aspects. Cardaliaguet P, Porretta A, eds. Mean Field Games, Lecture Notes in Mathematics, vol. 2281 (Springer, Cham, Switzerland), 249–307.
  • [2] Albi G, Choi Y, Fornasier M, Kalise D (2017) Mean field control hierarchy. Appl. Math. Optim. 76(1):93–135.
  • [3] Albi G, Herty M, Kalise D, Segala C (2022) Moment-driven predictive control of mean-field collective dynamics. SIAM J. Control Optim. 60(2):814–841.
  • [4] Aldous DJ, Ibragimov IA, Jacod J (1985) École d’Été de Probabilités de Saint-Flour XIII, 1983, Lecture Notes in Mathematics, vol. 1117 (Springer, Berlin).
  • [5] Anahtarci B, Kariksiz C, Saldi N (2023) Learning mean-field games with discounted and average costs. J. Machine Learn. Res. 24(17):1–59.
  • [6] Arabneydi J, Mahajan A (2014) Team optimal control of coupled subsystems with mean-field sharing. 53rd IEEE Conf. Decision Control, 1669–1674.
  • [7] Arabneydi J, Mahajan A (2015) Team-optimal solution of finite number of mean-field coupled LQG subsystems. IEEE 54th Annual Conf. Decision Control (CDC), 5308–5313.
  • [8] Balder E (1997) Consequences of denseness of Dirac Young measures. J. Math. Anal. Appl. 207(2):536–540.
  • [9] Bardi M, Fischer M (2019) On non-uniqueness and uniqueness of solutions in finite-horizon mean field games. ESAIM COCV 25:44.
  • [10] Bäuerle N (2023) Mean field Markov decision processes. Appl. Math. Optim. 88(1):12.
  • [11] Bayraktar E, Zhang X (2020) On non-uniqueness in mean field games. Proc. Amer. Math. Soc. 148(9):4091–4106.
  • [12] Bayraktar E, Bäuerle N, Kara AD (2025) Finite approximations for mean-field type multi-agent control and their near optimality. Appl. Math. Optim. 92(1):7.
  • [13] Bayraktar E, Cosso A, Pham H (2018) Randomized dynamic programming principle and Feynman–Kac representation for optimal control of McKean–Vlasov dynamics. Trans. Amer. Math. Soc. 370(3):2115–2160.
  • [14] Beiglböck M, Lacker D (2018) Denseness of adapted processes among causal couplings. Preprint, submitted May 8, https://arxiv.org/abs/1805.03185.
  • [15] Bensoussan A, Frehse J, Yam S (2015) The master equation in mean field theory. J. Mathématiques Pures Appliquées 103(6):1441–1474.
  • [16] Borkar V (1988) The probabilistic structure of controlled diffusion processes. Acta Appl. Math. 11(1):19–48.
  • [17] Caines PE, Huang M, Malhamé RP (2017) Mean field games. Başar T, Zaccour G, eds. Handbook of Dynamic Game Theory (Springer, Cham, Switzerland), 1–28.
  • [18] Cardaliaguet P, Daudin S, Jackson J, Souganidis P (2023) An algebraic convergence rate for the optimal control of McKean–Vlasov dynamics. SIAM J. Control Optim. 61(6):3341–3369.
  • [19] Carmona R (2020) Applications of mean field games in financial engineering and economic theory. Preprint, submitted December 9, https://arxiv.org/abs/2012.05237.
  • [20] Carmona R, Delarue F (2015) Forward–backward stochastic differential equations and controlled McKean–Vlasov dynamics. Ann. Probab. 43(5):2647–2700.
  • [21] Carmona R, Delarue F (2018) Probabilistic Theory of Mean Field Games with Applications II: Mean Field Games with Common Noise and Master Equations (Springer, Cham, Switzerland).
  • [22] Carmona R, Delarue F, Lacker D (2016) Mean field games with common noise. Ann. Probab. 44(6):3740–3803.
  • [23] Carmona R, Laurière M, Tan Z (2023) Model-free mean-field reinforcement learning: Mean-field MDP and mean-field Q-learning. Ann. Appl. Probab. 33(6B):5334–5381.
  • [24] Carrillo J, Kalise D, Rossi F, Trélat E (2022) Controlling swarms toward flocks and mills. SIAM J. Control Optim. 60(3):1863–1891.
  • [25] Castaing C, Fitte PR, Valadier M (2004) Young Measures on Topological Spaces: With Applications in Control Theory and Probability Theory, vol. 571 (Springer Science & Business Media, Dordrecht, Netherlands).
  • [26] Cecchin A (2021) Finite state N-agent and mean field control problems. ESAIM COCV 27:31.
  • [27] Cecchin A, Fischer M (2020) Probabilistic approach to finite state mean field games. Appl. Math. Optim. 81(2):253–300.
  • [28] Delarue F, Tchuendom R (2020) Selection of equilibria in a linear quadratic mean-field game. Stochastic Processes Appl. 130(2):1000–1040.
  • [29] Diaconis P, Freedman D (1980) Finite exchangeable sequences. Ann. Probab. 8(4):745–764.
  • [30] Djete M, Possamaï D, Tan X (2022) McKean–Vlasov optimal control: Limit theory and equivalence between different formulations. Math. Oper. Res. 47(4):2891–2930.
  • [31] Elliott R, Li X, Ni Y (2013) Discrete time mean-field stochastic linear-quadratic optimal control problems. Automatica 49(11):3222–3233.
  • [32] Fischer M (2017) On the connection between symmetric N-player games and mean field games. Ann. Appl. Probab. 27(2):757–810.
  • [33] Fornasier M, Lisini S, Orrieri C, Savaré G (2019) Mean-field optimal control as gamma-limit of finite agent controls. Eur. J. Appl. Math. 30(6):1153–1186.
  • [34] Hajek B, Livesay M (2019) On non-unique solutions in mean field games. 2019 IEEE 58th Conf. Decision Control (CDC), 1219–1224.
  • [35] Hernández-Lerma O, Lasserre JB (1996) Discrete-Time Markov Control Processes: Basic Optimality Criteria (Springer, New York).
  • [36] Hespanha J, Naghshtabrizi P, Xu Y (2007) A survey of recent results in networked control systems. Proc. IEEE 95(1):138–162.
  • [37] Ho Y (1980) Team decision theory and information structures. Proc. IEEE 68(6):644–654.
  • [38] Huang M, Caines P, Malhamé R (2012) Social optima in mean field LQG control: Centralized and decentralized strategies. IEEE Trans. Automatic Control 57(7):1736–1751.
  • [39] Huang M, Caines PE, Malhamé RP (2006) Large population stochastic dynamic games: Closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Comm. Inform. Systems 6(3):221–251.
  • [40] Huang M, Caines PE, Malhamé RP (2007) Large-population cost-coupled LQG problems with nonuniform agents: Individual-mass behavior and decentralized ϵ-Nash equilibria. IEEE Trans. Automatic Control 52(9):1560–1571.
  • [41] Jackson J, Lacker D (2025) Approximately optimal distributed stochastic controls beyond the mean field setting. Ann. Appl. Probab. 35(1):251–308.
  • [42] Kallenberg O (2006) Probabilistic Symmetries and Invariance Principles (Springer Science & Business Media, New York).
  • [43] Kara A, Yüksel S (2020) Robustness to incorrect system models in stochastic control. SIAM J. Control Optim. 58(2):1144–1182.
  • [44] Kara A, Saldi N, Yüksel S (2023) Q-learning for MDPs with general spaces: Convergence and near optimality via quantization under weak continuity. J. Machine Learn. Res. 24(199):1–34.
  • [45] Lacker D (2015) Mean field games via controlled martingale problems: Existence of Markovian equilibria. Stochastic Processes Appl. 125(7):2856–2894.
  • [46] Lacker D (2017) Limit theory for controlled McKean–Vlasov dynamics. SIAM J. Control Optim. 55(3):1641–1672.
  • [47] Lacker D (2020) On the convergence of closed-loop Nash equilibria to the mean field game limit. Ann. Appl. Probab. 30(4):1693–1761.
  • [48] Langen HJ (1981) Convergence of dynamic programming models. Math. Oper. Res. 6(4):493–512.
  • [49] Lasry JM, Lions PL (2007) Mean field games. Japanese J. Math. 2(1):229–260.
  • [50] Laurière M, Pironneau O (2016) Dynamic programming for mean-field type control. J. Optim. Theory Appl. 169(3):902–924.
  • [51] Mahajan A, Martins N, Rotkowitz M, Yüksel S (2012) Information structures in optimal decentralized control. IEEE Conf. Decision Control.
  • [52] Marschak J (1955) Elements for a theory of teams. Management Sci. 1(2):127–137.
  • [53] Milgrom P, Weber R (1985) Distributional strategies for games with incomplete information. Math. Oper. Res. 10(4):619–632.
  • [54] Motte M, Pham H (2022) Mean-field Markov decision processes with common noise and open-loop controls. Ann. Appl. Probab. 32(2):1421–1458.
  • [55] Motte M, Pham H (2023) Quantitative propagation of chaos for mean field Markov decision process with common noise. Electronic J. Probab. 28:1–24.
  • [56] Ni Y, Elliott R, Li X (2015) Discrete-time mean-field stochastic linear–quadratic optimal control problems, II: Infinite horizon case. Automatica 57:65–77.
  • [57] Pham H, Wei X (2016) Discrete time McKean–Vlasov control problem: A dynamic programming approach. Appl. Math. Optim. 74(3):487–506.
  • [58] Pham H, Wei X (2017) Dynamic programming for optimal control of stochastic McKean–Vlasov dynamics. SIAM J. Control Optim. 55(2):1069–1101.
  • [59] Radner R (1962) Team decision problems. Ann. Math. Statist. 33(3):857–881.
  • [60] Saldi N, Başar T, Raginsky M (2018) Markov–Nash equilibria in mean-field games with discounted cost. SIAM J. Control Optim. 56(6):4256–4287.
  • [61] Saldi N, Linder T, Yüksel S (2018) Finite Approximations in Discrete-Time Stochastic Control: Quantized Models and Asymptotic Optimality (Birkhäuser, Cham, Switzerland).
  • [62] Saldi N, Yüksel S, Linder T (2017) On the asymptotic optimality of finite approximations to Markov decision processes with Borel spaces. Math. Oper. Res. 42(4):945–978.
  • [63] Sanjari S, Yüksel S (2021) Optimal policies for convex symmetric stochastic dynamic teams and their mean-field limit. SIAM J. Control Optim. 59(2):777–804.
  • [64] Sanjari S, Yüksel S (2021) Optimal solutions to infinite-player stochastic teams and mean-field teams. IEEE Trans. Automatic Control 66(3):1071–1086.
  • [65] Sanjari S, Saldi N, Yüksel S (2023) Optimality of independently randomized symmetric policies for exchangeable stochastic teams with infinitely many decision makers. Math. Oper. Res. 48(3):1254–1285.
  • [66] Sanjari S, Saldi N, Yüksel S (2024) Nash equilibria for exchangeable team-against-team games, their mean-field limit, and the role of common randomness. SIAM J. Control Optim. 62(3):1437–1464.
  • [67] Serfozo R (1982) Convergence of Lebesgue integrals with varying measures. Sankhyā Indian J. Statist. Ser. A 44(3):380–402.
  • [68] Subramanian J, Kumar A, Mahajan A (2023) Mean-field games among teams. Preprint, submitted October 18, https://arxiv.org/abs/2310.12282.
  • [69] Toumi N, Malhamé R, Le Ny J (2024) A mean field game approach for a class of linear quadratic discrete choice problems with congestion avoidance. Automatica 160:111420.
  • [70] Tsitsiklis J (1988) Decentralized detection by a large number of sensors. Math. Control. Signals Systems 1(2):167–182.
  • [71] Witsenhausen HS (1975) The intrinsic model for discrete stochastic control: Some open problems. Bensoussan A, Lions JL, eds. Control Theory, Numerical Methods and Computer Systems Modelling, Lecture Notes in Economics and Mathematical Systems, vol. 107 (Springer, Berlin), 322–335.
  • [72] Yongacoglu B, Arslan G, Yüksel S (2024) Mean-field games with finitely many players: Independent learning and subjectivity. J. Machine Learn. Res. 25(419):1–69.
  • [73] Yüksel S (2024) On Borkar and Young relaxed control topologies and continuous dependence of invariant measures on control policy. SIAM J. Control Optim. 62(4):2367–2386.
  • [74] Yüksel S, Başar T (2013) Stochastic Networked Control Systems: Stabilization and Optimization Under Information Constraints (Springer, New York).
  • [75] Yüksel S, Başar T (2024) Stochastic Teams, Games, and Control Under Information Constraints (Springer, Cham, Switzerland).