Distributionally Robust Inverse Covariance Estimation: The Wasserstein Shrinkage Estimator

Published Online: https://doi.org/10.1287/opre.2020.2076

References

  • Banerjee O, El Ghaoui L, d’Aspremont A (2008) Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Machine Learn. Res. 9(June):485–516.
  • Berge C (1963) Topological Spaces: Including a Treatment of Multi-Valued Functions, Vector Spaces, and Convexity (Dover, Mineola, NY).
  • Bernstein DS (2009) Matrix Mathematics: Theory, Facts, and Formulas (Princeton University Press, Princeton, NJ).
  • Bertsekas DP (2009) Convex Optimization Theory (Athena Scientific, Belmont, MA).
  • Bien J, Tibshirani RJ (2011) Sparse estimation of a covariance matrix. Biometrika 98(4):807–820.
  • Blanchet J, Murthy K (2016) Quantifying distributional model risk via optimal transport. Math. Oper. Res. 44(2):565–600.
  • Blanchet J, Si N (2019) Optimal uncertainty size in distributionally robust inverse covariance estimation. Oper. Res. Lett. 47(6):618–621.
  • Boyd S, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).
  • Chun SY, Browne MW, Shapiro A (2018) Modified distribution-free goodness-of-fit test statistic. Psychometrika 83(1):48–66.
  • Dahl J, Roychowdhury V, Vandenberghe L (2005) Maximum likelihood estimation of Gaussian graphical models: Numerical implementation and topology selection. Working paper, University of California, Los Angeles, Los Angeles.
  • Das A, Sampson AL, Lainscsek C, Muller L, Lin W, Doyle JC, Cash SS, Halgren E, Sejnowski TJ (2017) Interpretation of the precision matrix and its application in estimating sparse brain connectivity during sleep spindles from human electrocorticography recordings. Neural Comput. 29(3):603–642.
  • Delage E, Ye Y (2010) Distributionally robust optimization under moment uncertainty with application to data-driven problems. Oper. Res. 58(3):595–612.
  • DeMiguel V, Nogales FJ (2009) Portfolio selection with robust estimation. Oper. Res. 57(3):560–577.
  • Dettling M (2004) BagBoosting for tumor classification with gene expression data. Bioinformatics 20(18):3583–3593.
  • Dey DK, Srinivasan C (1985) Estimation of a covariance matrix under Stein’s loss. Ann. Statist. 13(4):1581–1591.
  • Du L, Li J, Stoica P (2010) Fully automatic computation of diagonal loading levels for robust adaptive beamforming. IEEE Trans. Aerospace Electronic Systems 46(1):449–458.
  • Fan J, Fan Y, Lv J (2008) High dimensional covariance matrix estimation using a factor model. J. Econometrics 147(1):186–197.
  • Fisher RA (1936) The use of multiple measurements in taxonomic problems. Ann. Eugenics 7(2):179–188.
  • Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.
  • Gao R, Kleywegt A (2016) Distributionally robust stochastic optimization with Wasserstein distance. Preprint, submitted April 8, https://arxiv.org/abs/1604.02199.
  • Gao R, Chen X, Kleywegt A (2017) Wasserstein distributional robustness and regularization in statistical learning. Preprint, submitted December 17, https://arxiv.org/abs/1712.06050.
  • Givens CR, Shortt RM (1984) A class of Wasserstein metrics for probability distributions. Michigan Math. J. 31(2):231–240.
  • Goh J, Sim M (2010) Distributionally robust optimization and its tractable approximations. Oper. Res. 58(4, Part 1):902–917.
  • Goto S, Xu Y (2015) Improving mean variance optimization through sparse hedging restrictions. J. Financial Quant. Anal. 50(6):1415–1441.
  • Haff LR (1991) The variational form of certain Bayes estimators. Ann. Statist. 19(3):1163–1190.
  • Hastie T, Tibshirani R, Friedman J (2001) The Elements of Statistical Learning (Springer, New York).
  • Hespanha JP (2009) Linear Systems Theory (Princeton University Press, Princeton, NJ).
  • Hsieh C-J, Sustik MA, Dhillon IS, Ravikumar P (2014) QUIC: Quadratic approximation for sparse inverse covariance estimation. J. Machine Learn. Res. 15(83):2911–2947.
  • Jagannathan R, Ma T (2003) Risk reduction in large portfolios: Why imposing the wrong constraints helps. J. Finance 58(4):1651–1683.
  • James W, Stein C (1961) Estimation with quadratic loss. Proc. 4th Berkeley Sympos. Math. Statist. Probab., Vol. 1: Contributions to the Theory of Statistics (University of California Press, Berkeley), 361–379.
  • Lauritzen SL (1996) Graphical Models (Oxford University Press, Oxford, UK).
  • Ledoit O, Wolf M (2004a) A well-conditioned estimator for large-dimensional covariance matrices. J. Multivariate Anal. 88(2):365–411.
  • Ledoit O, Wolf M (2004b) Honey, I shrunk the sample covariance matrix. J. Portfolio Management 30(4):110–119.
  • Ledoit O, Wolf M (2012) Nonlinear shrinkage estimation of large-dimensional covariance matrices. Ann. Statist. 40(2):1024–1060.
  • Ledoit O, Wolf M (2003) Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. J. Empirical Finance 10(5):603–621.
  • Löfberg J (2004) YALMIP: A toolbox for modeling and optimization in MATLAB. 2004 IEEE Internat. Conf. Robotics Automation (IEEE, Piscataway, NJ), 284–289.
  • Markowitz H (1952) Portfolio selection. J. Finance 7(1):77–91.
  • Mohajerin Esfahani P, Kuhn D (2017) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming 171(1–2):115–166.
  • Murphy KP (2013) Machine Learning: A Probabilistic Perspective (MIT Press, Cambridge, MA).
  • Nocedal J, Wright SJ (2006) Numerical Optimization (Springer, New York).
  • Nguyen VA, Shafieezadeh-Abadeh S, Kuhn D, Mohajerin Esfahani P (2022) Bridging Bayesian and minimax mean square error estimation via Wasserstein distributionally robust optimization. Math. Oper. Res. Forthcoming.
  • Oztoprak F, Nocedal J, Rennie S, Olsen PA (2012) Newton-like methods for sparse inverse covariance estimation. Pereira F, Burges CJC, Bottou L, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 25 (Curran Associates, Red Hook, NY), 755–763.
  • Pan VY, Chen ZQ (1999) The complexity of the matrix eigenproblem. Proc. 31st Annual ACM Sympos. Theory Comput. (ACM, New York), 507–516.
  • Perlman MD (2007) STAT 542: Multivariate statistical analysis. Lecture notes, University of Washington, Seattle. http://courses.washington.edu/stat512/542Notes2007.pdf.
  • Ribes A, Azaïs J-M, Planton S (2009) Adaptation of the optimal fingerprint method for climate change detection using a well-conditioned covariance matrix estimate. Climate Dynam. 33(5):707–722.
  • Rippl T, Munk A, Sturm A (2016) Limit laws of the empirical Wasserstein distance: Gaussian distributions. J. Multivariate Anal. 151(October):90–109.
  • Schäfer J, Strimmer K (2005) A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statist. Appl. Genetics Molecular Biol. 4(1):Article 32.
  • Shafieezadeh-Abadeh S, Kuhn D, Mohajerin Esfahani P (2017) Regularization via mass transportation. Preprint, submitted October 27, https://arxiv.org/abs/1710.10016.
  • Shafieezadeh-Abadeh S, Mohajerin Esfahani P, Kuhn D (2015) Distributionally robust logistic regression. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 28 (Curran Associates, Red Hook, NY), 1576–1584.
  • Stein C (1975) Estimation of a covariance matrix. Rietz Lecture, 39th Annual Meeting IMS, Institute of Mathematical Statistics, Cleveland.
  • Stein C (1986) Lectures on the theory of estimation of many parameters. J. Soviet Math. 34(1):1373–1403.
  • Stevens GVG (1998) On the inverse of the covariance matrix in portfolio analysis. J. Finance 53(5):1821–1827.
  • Torri G, Giacometti R, Paterlini S (2019) Sparse precision matrices for minimum variance portfolios. Comput. Management Sci. 16(3):375–400.
  • Touloumis A (2015) Nonparametric Stein-type shrinkage covariance matrix estimators in high-dimensional settings. Comput. Statist. Data Anal. 83(March):251–261.
  • Tseng P, Yun S (2009) A coordinate gradient descent method for nonsmooth separable minimization. Math. Programming 117(1–2):387–423.
  • Tütüncü RH, Toh KC, Todd MJ (2003) Solving semidefinite-quadratic-linear programs using SDPT3. Math. Programming 95(2):189–217.
  • van der Vaart HR (1961) On certain characteristics of the distribution of the latent roots of a symmetric random matrix under general conditions. Ann. Math. Statist. 32(3):864–873.
  • Wiesel A, Eldar Y, Hero A (2010) Covariance estimation in decomposable Gaussian graphical models. IEEE Trans. Signal Process. 58(3):1482–1492.
  • Wiesemann W, Kuhn D, Sim M (2014) Distributionally robust convex optimization. Oper. Res. 62(6):1358–1376.
  • Won J-H, Lim J, Kim S-J, Rajaratnam B (2013) Condition number regularized covariance estimation. J. Roy. Statist. Soc. Ser. B Statist. Methodol. 75(3):427–450.
  • Yang R, Berger JO (1994) Estimation of a covariance matrix using the reference prior. Ann. Statist. 22(3):1195–1211.
  • Zhao C, Guan Y (2018) Data-driven risk-averse stochastic optimization with Wasserstein metric. Oper. Res. Lett. 46(2):262–267.