[1] Ambrosio L, Gigli N, Savaré G (2008) Gradient Flows: In Metric Spaces and in the Space of Probability Measures (Birkhäuser, Basel).Google Scholar
[2] Andrews I, Gentzkow M, Shapiro JM (2017) Measuring the sensitivity of parameter estimates to estimation moments. Quart. J. Econom. 132(4):1553–1592.Crossref, Google Scholar
[3] Bartle RG (1976) The Elements of Real Analysis (Wiley, New York).Google Scholar
[4] Billingsley P (1999) Convergence of Probability Measures, Wiley Series in Probability and Statistics, 2nd ed. (Wiley, New York).Crossref, Google Scholar
[5] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.Crossref, Google Scholar
[6] Boyd N, Schiebinger G, Recht B (2017) The alternating descent conditional gradient method for sparse inverse problems. SIAM J. Optim. 27(2):616–639.Crossref, Google Scholar
[7] Bredies K, Pikkarainen H (2013) Inverse problems in spaces of measures. ESAIM: Control Optim. Calculus Variations 19(1):190–218.Crossref, Google Scholar
[8] Bubeck S (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3–4):231–357.Crossref, Google Scholar
[9] Candès EJ, Fernandez-Granda C (2013) Super-resolution from noisy data. J. Fourier Anal. Appl. 19(6):1229–1254.Crossref, Google Scholar
[10] Candès EJ, Fernandez-Granda C (2014) Towards a mathematical theory of super-resolution. Comm. Pure Appl. Math. 67(6):906–956.Crossref, Google Scholar
[11] Canuto C, Urban K (2005) Adaptive optimization of convex functionals in Banach spaces. SIAM J. Numer. Anal. 42(5):2043–2075.Crossref, Google Scholar
[12] Carrillo JA, Craig K, Patacchini FS (2019) A blob method for diffusion. Calculus Variations Partial Differential Equations 58(2):53.Crossref, Google Scholar
[13] Carrillo JA, Craig K, Wang L, Wei C (2022) Primal dual methods for Wasserstein gradient flows. Foundations Comput. Math. 22(2):389–443.Crossref, Google Scholar
[14] Chernozhukov V, Escanciano JC, Ichimura H, Newey WK, Robins JM (2022) Locally robust semiparametric estimation. Econometrica 90(4):1501–1535.Crossref, Google Scholar
[15] Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, Robins J (2018) Double/debiased machine learning for treatment and structural parameters. Econometrics J. 21(1):C1–C68. Crossref, Google Scholar
[16] Chizat L (2022) Convergence rates of gradient methods for convex optimization in the space of measures. Open J. Math. Optim. 3:8.Google Scholar
[17] Chizat L (2022) Sparse optimization on measures with over-parameterized gradient descent. Math. Programming 194(1–2):487–532.Crossref, Google Scholar
[18] Chu C, Blanchet J, Glynn P (2019) Probability functional descent: A unifying perspective on GANs, variational inference, and reinforcement learning. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 1213–1222.Google Scholar
[19] Daskin M (1983) A maximum expected covering location model: Formulation, properties, and heuristic solution. Transportation Sci. 17(1):48–70.Link, Google Scholar
[20] De Castro Y, Gamboa F (2012) Exact reconstruction using Beurling minimal extrapolation. J. Math. Anal. Appl. 395(1):336–354.Crossref, Google Scholar
[21] Demynov VF, Rubinov AM (1970) Approximate Methods in Optimization Problems (Elsevier, New York).Google Scholar
[22] Dunn JC, Harshbarger S (1978) Conditional gradient algorithms with open loop step size rules. J. Math. Anal. Appl. 62(2):432–444.Crossref, Google Scholar
[23] Durrett R (2019) Probability: Theory and Examples, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 49 (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
[24] Eftekhari A, Thompson A (2019) Sparse inverse problems over measures: Equivalence of the conditional gradient and exchange methods. SIAM J. Optim. 29(2):1329–1349. Crossref, Google Scholar
[25] Fedorov VV, Hackl P (1997) Model-Oriented Design of Experiments, Lecture Notes in Statistics, vol. 125 (Springer, New York).Crossref, Google Scholar
[26] Fernandez-Granda C (2013) Support detection in super-resolution. Preprint, submitted February 16, https://arxiv.org/abs/1302.3921.Google Scholar
[27] Fernholz LT (1983) von Mises Calculus for Statistical Functionals, Lecture Notes in Statistics, vol. 19 (Springer, New York). Crossref, Google Scholar
[28] Fernholz LT (2011) Functional derivatives in statistics: Asymptotics and robustness. Lovric M, ed. International Encyclopedia of Statistical Science (Springer, Berlin, Heidelberg).Google Scholar
[29] Firpo S, Fortin NM, Lemieux T (2009) Unconditional quantile regressions. Econometrica 77(3):953–973.Crossref, Google Scholar
[30] Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res. Logist. Quart. 3(1–2):95–110.Crossref, Google Scholar
[31] Futami F, Cui Z, Sato I, Sugiyama M (2019) Bayesian posterior approximation via greedy particle optimization. Proc. AAAI Conf. Artificial Intelligence 33(1):3606–3613.Google Scholar
[32] Garreis S, Ulbrich M (2019) An inexact trust-region algorithm for constrained problems in Hilbert space and its application to the adaptive solution of optimal control problems with PDEs. Working paper, Technical University of Munich, Munich, Germany.Google Scholar
[33] Geer SA (2000) Empirical Processes in M-Estimation, vol. 6 (Cambridge University Press, Cambridge, UK).Google Scholar
[34] Hampel FR (1974) The influence curve and its role in robust estimation. J. Amer. Statist. Assoc. 69(346):383–393.Crossref, Google Scholar
[35] Henderson SG, van den Berg PL, Jagtenberg CJ, Li H (2022) How should volunteers be dispatched to out-of-hospital cardiac arrest cases? Queueing Systems 100(3–4):437–439.Crossref, Google Scholar
[36] Huber PJ (1996) Robust Statistical Procedures, CBMS-NSF Regional Conference Series in Applied Mathematics, 2nd ed., vol. 68 (Society for Industrial and Applied Mathematics, Philadelphia).Google Scholar
[37] Jin C, Zhang Y, Balakrishnan S, Wainwright MJ, Jordan MI (2016) Local maxima in the likelihood of Gaussian mixture models: Structural results and algorithmic consequences. Advances in Neural Information Processing Systems, 29.Google Scholar
[38] Kent C, Blanchet J, Glynn P (2021) Frank-Wolfe methods in probability space. Preprint, submitted May 11, https://arxiv.org/abs/2105.05352.Google Scholar
[39] Kiefer J (1974) General equivalence theory for optimum designs (approximate theory). Ann. Statist. 2(5):849–879.Crossref, Google Scholar
[40] Lacoste-Julien S (2016) Convergence rate of Frank-Wolfe for non-convex objectives. arXiv preprint arXiv:1607.00345.Google Scholar
[41] Liu Q, Wang D (2016) Stein variational gradient descent: A general purpose Bayesian inference algorithm. Advances in Neural Information Processing Systems, 29 (Curran Associates Inc., Red Hook, NY).Google Scholar
[42] Magnus JR, Neudecker H (2019) Matrix Differential Calculus with Applications in Statistics and Econometrics (John Wiley & Sons, New York).Crossref, Google Scholar
[43] Mei S, Montanari A, Nguyen PM (2018) A mean field view of the landscape of two-layer neural networks. Proc. Natl. Acad. Sci. USA 115(33):E7665–E7671.Crossref, Google Scholar
[44] Molchanov I, Zuyev S (2000) Tangent sets in the space of measures: With applications to variational analysis. J. Math. Anal. Appl. 249(2):539–552.Crossref, Google Scholar
[45] Molchanov I, Zuyev S (2002) Steepest descent algorithms in a space of measures. Statist. Comput. 12(2):115–123. Crossref, Google Scholar
[46] Myat A, Song KJ, Rea T (2018) Out-of-hospital cardiac arrest: Current concepts. Lancet 391(10124):970–979.Crossref, Google Scholar
[47] Nesterov Y (2004) Introductory Lectures on Convex Optimization A Basic Course, Applied Optimization, 1st ed., vol. 87 (Springer, New York).Crossref, Google Scholar
[48] Nesterov Y (2018) Lectures on Convex Optimization, Springer Optimization and Its Applications, vol. 137 (Springer, Cham, Switzerland).Crossref, Google Scholar
[49] Okabe A (2000) Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Wiley Series in Probability and Statistics, 2nd ed. (Wiley, Chichester, UK).Crossref, Google Scholar
[50] Pokutta S (2023) The Frank-Wolfe algorithm: A short introduction. Jahresbericht der Deutschen Mathematiker-Vereinigung 126(1):3–35.Google Scholar
[51] Pukelsheim F (1983) Optimal Design of Experiments (Wiley, New York).Google Scholar
[52] Rao M, Chen Y, Vemuri BC, Wang F (2004) Cumulative residual entropy: A new measure of information. IEEE Trans. Inform. Theory 50(6):1220–1228.Crossref, Google Scholar
[53] Reddi SJ, Sra S, Póczós B, Smola A (2016) Stochastic Frank-Wolfe methods for nonconvex optimization. 2016 54th Annual Allerton Conf. Comm. Control Comput. (Allerton) (IEEE), 1244–1251.Google Scholar
[54] Santambrogio F (2015) Optimal Transport for Applied Mathematicians, Progress in Nonlinear Differential Equations and Their Applications, vol. 87 (Birkhäuser, Cham, Switzerland).Crossref, Google Scholar
[55] Shao X (2015) Self-normalization for time series: A review of recent developments. J. Amer. Statist. Assoc. 110(512):1797–1817.Crossref, Google Scholar
[56] Shapiro A, Dentcheva D, Ruszczynski A (2009) Lectures on Stochastic Programming: Modeling and Theory (Society for Industrial and Applied Mathematics, Philadelphia).Crossref, Google Scholar
[57] Ulbrich S, Ziems JC (2017) Adaptive multilevel trust-region methods for time-dependent PDE-constrained optimization. Portugaliae Mathematica 74(1):37–67.Crossref, Google Scholar
[58] van den Berg PL, Henderson SG, Jagtenberg CJ, Li H (2025) Modeling the impact of community first responders. Management Sci. 71(2):992–1008.Link, Google Scholar
[59] Yan Y, Wang K, Rigollet P (2024) Learning Gaussian mixtures using the Wasserstein–Fisher–Rao gradient flow. Ann. Statist. 52(4):1774–1795.Crossref, Google Scholar

cover image Mathematics of Operations Research

Articles In Advance

Article Information

Metrics

Information

Received:June 28, 2024
Accepted:July 22, 2025
Published Online:September 23, 2025

Cite as

Di Yu, Shane G. Henderson, Raghu Pasupathy (2025) Deterministic and Stochastic Frank–Wolfe Recursion on Probability Spaces. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2024.0584

Keywords

Acknowledgments

The authors thank the editorial team for helpful reports that improved the content and exposition of this paper.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Deterministic and Stochastic Frank–Wolfe Recursion on Probability Spaces

References

Articles In Advance

Article Information

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News