Deterministic and Stochastic Frank–Wolfe Recursion on Probability Spaces

Published Online:https://doi.org/10.1287/moor.2024.0584

References

  • [1] Ambrosio L, Gigli N, Savaré G (2008) Gradient Flows: In Metric Spaces and in the Space of Probability Measures (Birkhäuser, Basel).Google Scholar
  • [2] Andrews I, Gentzkow M, Shapiro JM (2017) Measuring the sensitivity of parameter estimates to estimation moments. Quart. J. Econom. 132(4):1553–1592.CrossrefGoogle Scholar
  • [3] Bartle RG (1976) The Elements of Real Analysis (Wiley, New York).Google Scholar
  • [4] Billingsley P (1999) Convergence of Probability Measures, Wiley Series in Probability and Statistics, 2nd ed. (Wiley, New York).CrossrefGoogle Scholar
  • [5] Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. SIAM Rev. 60(2):223–311.CrossrefGoogle Scholar
  • [6] Boyd N, Schiebinger G, Recht B (2017) The alternating descent conditional gradient method for sparse inverse problems. SIAM J. Optim. 27(2):616–639.CrossrefGoogle Scholar
  • [7] Bredies K, Pikkarainen H (2013) Inverse problems in spaces of measures. ESAIM: Control Optim. Calculus Variations 19(1):190–218.CrossrefGoogle Scholar
  • [8] Bubeck S (2015) Convex optimization: Algorithms and complexity. Foundations Trends Machine Learn. 8(3–4):231–357.CrossrefGoogle Scholar
  • [9] Candès EJ, Fernandez-Granda C (2013) Super-resolution from noisy data. J. Fourier Anal. Appl. 19(6):1229–1254.CrossrefGoogle Scholar
  • [10] Candès EJ, Fernandez-Granda C (2014) Towards a mathematical theory of super-resolution. Comm. Pure Appl. Math. 67(6):906–956.CrossrefGoogle Scholar
  • [11] Canuto C, Urban K (2005) Adaptive optimization of convex functionals in Banach spaces. SIAM J. Numer. Anal. 42(5):2043–2075.CrossrefGoogle Scholar
  • [12] Carrillo JA, Craig K, Patacchini FS (2019) A blob method for diffusion. Calculus Variations Partial Differential Equations 58(2):53.CrossrefGoogle Scholar
  • [13] Carrillo JA, Craig K, Wang L, Wei C (2022) Primal dual methods for Wasserstein gradient flows. Foundations Comput. Math. 22(2):389–443.CrossrefGoogle Scholar
  • [14] Chernozhukov V, Escanciano JC, Ichimura H, Newey WK, Robins JM (2022) Locally robust semiparametric estimation. Econometrica 90(4):1501–1535.CrossrefGoogle Scholar
  • [15] Chernozhukov V, Chetverikov D, Demirer M, Duflo E, Hansen C, Newey W, Robins J (2018) Double/debiased machine learning for treatment and structural parameters. Econometrics J. 21(1):C1–C68. CrossrefGoogle Scholar
  • [16] Chizat L (2022) Convergence rates of gradient methods for convex optimization in the space of measures. Open J. Math. Optim. 3:8.Google Scholar
  • [17] Chizat L (2022) Sparse optimization on measures with over-parameterized gradient descent. Math. Programming 194(1–2):487–532.CrossrefGoogle Scholar
  • [18] Chu C, Blanchet J, Glynn P (2019) Probability functional descent: A unifying perspective on GANs, variational inference, and reinforcement learning. Proc. 36th Internat. Conf. Machine Learn., vol. 97 (PMLR, New York), 1213–1222.Google Scholar
  • [19] Daskin M (1983) A maximum expected covering location model: Formulation, properties, and heuristic solution. Transportation Sci. 17(1):48–70.LinkGoogle Scholar
  • [20] De Castro Y, Gamboa F (2012) Exact reconstruction using Beurling minimal extrapolation. J. Math. Anal. Appl. 395(1):336–354.CrossrefGoogle Scholar
  • [21] Demynov VF, Rubinov AM (1970) Approximate Methods in Optimization Problems (Elsevier, New York).Google Scholar
  • [22] Dunn JC, Harshbarger S (1978) Conditional gradient algorithms with open loop step size rules. J. Math. Anal. Appl. 62(2):432–444.CrossrefGoogle Scholar
  • [23] Durrett R (2019) Probability: Theory and Examples, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 49 (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • [24] Eftekhari A, Thompson A (2019) Sparse inverse problems over measures: Equivalence of the conditional gradient and exchange methods. SIAM J. Optim. 29(2):1329–1349. CrossrefGoogle Scholar
  • [25] Fedorov VV, Hackl P (1997) Model-Oriented Design of Experiments, Lecture Notes in Statistics, vol. 125 (Springer, New York).CrossrefGoogle Scholar
  • [26] Fernandez-Granda C (2013) Support detection in super-resolution. Preprint, submitted February 16, https://arxiv.org/abs/1302.3921.Google Scholar
  • [27] Fernholz LT (1983) von Mises Calculus for Statistical Functionals, Lecture Notes in Statistics, vol. 19 (Springer, New York). CrossrefGoogle Scholar
  • [28] Fernholz LT (2011) Functional derivatives in statistics: Asymptotics and robustness. Lovric M, ed. International Encyclopedia of Statistical Science (Springer, Berlin, Heidelberg).Google Scholar
  • [29] Firpo S, Fortin NM, Lemieux T (2009) Unconditional quantile regressions. Econometrica 77(3):953–973.CrossrefGoogle Scholar
  • [30] Frank M, Wolfe P (1956) An algorithm for quadratic programming. Naval Res. Logist. Quart. 3(1–2):95–110.CrossrefGoogle Scholar
  • [31] Futami F, Cui Z, Sato I, Sugiyama M (2019) Bayesian posterior approximation via greedy particle optimization. Proc. AAAI Conf. Artificial Intelligence 33(1):3606–3613.Google Scholar
  • [32] Garreis S, Ulbrich M (2019) An inexact trust-region algorithm for constrained problems in Hilbert space and its application to the adaptive solution of optimal control problems with PDEs. Working paper, Technical University of Munich, Munich, Germany.Google Scholar
  • [33] Geer SA (2000) Empirical Processes in M-Estimation, vol. 6 (Cambridge University Press, Cambridge, UK).Google Scholar
  • [34] Hampel FR (1974) The influence curve and its role in robust estimation. J. Amer. Statist. Assoc. 69(346):383–393.CrossrefGoogle Scholar
  • [35] Henderson SG, van den Berg PL, Jagtenberg CJ, Li H (2022) How should volunteers be dispatched to out-of-hospital cardiac arrest cases? Queueing Systems 100(3–4):437–439.CrossrefGoogle Scholar
  • [36] Huber PJ (1996) Robust Statistical Procedures, CBMS-NSF Regional Conference Series in Applied Mathematics, 2nd ed., vol. 68 (Society for Industrial and Applied Mathematics, Philadelphia).Google Scholar
  • [37] Jin C, Zhang Y, Balakrishnan S, Wainwright MJ, Jordan MI (2016) Local maxima in the likelihood of Gaussian mixture models: Structural results and algorithmic consequences. Advances in Neural Information Processing Systems, 29.Google Scholar
  • [38] Kent C, Blanchet J, Glynn P (2021) Frank-Wolfe methods in probability space. Preprint, submitted May 11, https://arxiv.org/abs/2105.05352.Google Scholar
  • [39] Kiefer J (1974) General equivalence theory for optimum designs (approximate theory). Ann. Statist. 2(5):849–879.CrossrefGoogle Scholar
  • [40] Lacoste-Julien S (2016) Convergence rate of Frank-Wolfe for non-convex objectives. arXiv preprint arXiv:1607.00345.Google Scholar
  • [41] Liu Q, Wang D (2016) Stein variational gradient descent: A general purpose Bayesian inference algorithm. Advances in Neural Information Processing Systems, 29 (Curran Associates Inc., Red Hook, NY).Google Scholar
  • [42] Magnus JR, Neudecker H (2019) Matrix Differential Calculus with Applications in Statistics and Econometrics (John Wiley & Sons, New York).CrossrefGoogle Scholar
  • [43] Mei S, Montanari A, Nguyen PM (2018) A mean field view of the landscape of two-layer neural networks. Proc. Natl. Acad. Sci. USA 115(33):E7665–E7671.CrossrefGoogle Scholar
  • [44] Molchanov I, Zuyev S (2000) Tangent sets in the space of measures: With applications to variational analysis. J. Math. Anal. Appl. 249(2):539–552.CrossrefGoogle Scholar
  • [45] Molchanov I, Zuyev S (2002) Steepest descent algorithms in a space of measures. Statist. Comput. 12(2):115–123. CrossrefGoogle Scholar
  • [46] Myat A, Song KJ, Rea T (2018) Out-of-hospital cardiac arrest: Current concepts. Lancet 391(10124):970–979.CrossrefGoogle Scholar
  • [47] Nesterov Y (2004) Introductory Lectures on Convex Optimization A Basic Course, Applied Optimization, 1st ed., vol. 87 (Springer, New York).CrossrefGoogle Scholar
  • [48] Nesterov Y (2018) Lectures on Convex Optimization, Springer Optimization and Its Applications, vol. 137 (Springer, Cham, Switzerland).CrossrefGoogle Scholar
  • [49] Okabe A (2000) Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Wiley Series in Probability and Statistics, 2nd ed. (Wiley, Chichester, UK).CrossrefGoogle Scholar
  • [50] Pokutta S (2023) The Frank-Wolfe algorithm: A short introduction. Jahresbericht der Deutschen Mathematiker-Vereinigung 126(1):3–35.Google Scholar
  • [51] Pukelsheim F (1983) Optimal Design of Experiments (Wiley, New York).Google Scholar
  • [52] Rao M, Chen Y, Vemuri BC, Wang F (2004) Cumulative residual entropy: A new measure of information. IEEE Trans. Inform. Theory 50(6):1220–1228.CrossrefGoogle Scholar
  • [53] Reddi SJ, Sra S, Póczós B, Smola A (2016) Stochastic Frank-Wolfe methods for nonconvex optimization. 2016 54th Annual Allerton Conf. Comm. Control Comput. (Allerton) (IEEE), 1244–1251.Google Scholar
  • [54] Santambrogio F (2015) Optimal Transport for Applied Mathematicians, Progress in Nonlinear Differential Equations and Their Applications, vol. 87 (Birkhäuser, Cham, Switzerland).CrossrefGoogle Scholar
  • [55] Shao X (2015) Self-normalization for time series: A review of recent developments. J. Amer. Statist. Assoc. 110(512):1797–1817.CrossrefGoogle Scholar
  • [56] Shapiro A, Dentcheva D, Ruszczynski A (2009) Lectures on Stochastic Programming: Modeling and Theory (Society for Industrial and Applied Mathematics, Philadelphia).CrossrefGoogle Scholar
  • [57] Ulbrich S, Ziems JC (2017) Adaptive multilevel trust-region methods for time-dependent PDE-constrained optimization. Portugaliae Mathematica 74(1):37–67.CrossrefGoogle Scholar
  • [58] van den Berg PL, Henderson SG, Jagtenberg CJ, Li H (2025) Modeling the impact of community first responders. Management Sci. 71(2):992–1008.LinkGoogle Scholar
  • [59] Yan Y, Wang K, Rigollet P (2024) Learning Gaussian mixtures using the Wasserstein–Fisher–Rao gradient flow. Ann. Statist. 52(4):1774–1795.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.