Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

Published Online:https://doi.org/10.1287/opre.2023.0381

References

  • Aktürk MS, Atamtürk A, Gürel S (2009) A strong conic quadratic reformulation for machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3):187–191.CrossrefGoogle Scholar
  • Behdin K, Mazumder R (2021) Sparse PCA: A new scalable estimator based on integer programming. Preprint, submitted September 23, https://arxiv.org/abs/2109.11142.Google Scholar
  • Belotti P, Kirches C, Leyffer S, Linderoth J, Luedtke J, Mahajan A (2013) Mixed-integer nonlinear optimization. Acta Numerica 22:1–131. CrossrefGoogle Scholar
  • Bertsimas D, Parys BV (2020) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Ann. Statist. 48(1):300–323.CrossrefGoogle Scholar
  • Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Ann. Statist. 44(2):813–852.CrossrefGoogle Scholar
  • Bertsimas D, Lamperski J, Pauphilet J (2020) Certifiably optimal sparse inverse covariance estimation. Math. Programming 184(1):491–530.CrossrefGoogle Scholar
  • Besag J (1975) Statistical analysis of non-lattice data. J. Roy. Statist. Soc. Ser. D Statistician 24(3):179–195.Google Scholar
  • Cai T, Liu W, Luo X (2011) A constrained ell-1 minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106(494):594–607.CrossrefGoogle Scholar
  • Chen W, Mazumder R (2024) Subgradient regularized multivariate convex regression at scale. SIAM J. Optim. 34(3):2350–2377.CrossrefGoogle Scholar
  • Dempster AP (1972) Covariance selection. Biometrics 28(1):157–175.CrossrefGoogle Scholar
  • Dey SS, Mazumder R, Wang G (2022) Using ℓ1-relaxation and integer programming to obtain dual bounds for sparse PCA. Oper. Res. 70(3):1914–1932.LinkGoogle Scholar
  • Dong H, Chen K, Linderoth J (2015) Regularization vs. relaxation: A conic optimization perspective of statistical variable selection. Preprint, submitted October 20, https://arxiv.org/abs/1510.06083.Google Scholar
  • Fattahi S, Gomez A (2021) Scalable inference of sparsely-changing Gaussian Markov random fields. Adv. Neural Inform. Processing Systems 34:6529–6541.Google Scholar
  • Frangioni A, Gentile C (2006) Perspective cuts for a class of convex 0–1 mixed integer programs. Math. Programming 106(2):225–236.CrossrefGoogle Scholar
  • Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.CrossrefGoogle Scholar
  • Friedman J, Hastie T, Tibshirani R (2010a) Applications of the lasso and grouped lasso to the estimation of sparse graphical models. Technical report, Stanford University, Stanford, CA.Google Scholar
  • Friedman J, Hastie T, Tibshirani R (2010b) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1.CrossrefGoogle Scholar
  • Günlük O, Linderoth J (2010) Perspective reformulations of mixed integer nonlinear programs with indicator variables. Math. Programming 124(1):183–205.CrossrefGoogle Scholar
  • Hastie T, Tibshirani R, Wainwright M (2015) Statistical Learning with Sparsity: The Lasso and Generalizations, Monographs on Statistics and Applied Probability, vol. 143 (CRC Press, Boca Raton, FL).Google Scholar
  • Hastie T, Tibshirani R, Friedman JH, Friedman JH (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, vol. 2 (Springer, Berlin).CrossrefGoogle Scholar
  • Hazimeh H, Mazumder R (2020) Fast best subset selection: Coordinate descent and local combinatorial optimization algorithms. Oper. Res. 68(5):1517–1537.LinkGoogle Scholar
  • Hazimeh H, Mazumder R, Saab A (2022) Sparse regression at scale: Branch-and-bound rooted in first-order optimization. Math. Programming 196(1–2):347–388.CrossrefGoogle Scholar
  • Höfling H, Tibshirani R (2009) Estimation of sparse binary pairwise Markov networks using pseudo-likelihoods. J. Machine Learn. Res. 10(4):883–906. Google Scholar
  • Khare K, Oh S-Y, Rajaratnam B (2015) A Convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees. J. Royal Statist. Soc. Series B: Statist. Methodology 77(4):803–825. CrossrefGoogle Scholar
  • Markowitz H (1952) Portfolio selection. J. Finance 7(1):77–91.Google Scholar
  • Mazumder R, Hastie T (2012) The graphical lasso: New insights and alternatives. Electronic J. Statist. 6:2125–2149. CrossrefGoogle Scholar
  • Mazumder R, Radchenko P, Dedieu A (2023) Subset selection with shrinkage: Sparse linear modeling when the SNR is low. Oper. Res. 71(1):129–147.LinkGoogle Scholar
  • Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34(3):1436–1462.CrossrefGoogle Scholar
  • Misra S, Vuffray M, Lokhov AY (2020) Information theoretic optimal learning of Gaussian graphical models. Abernethy J, Agarwal S, eds. Proc. Conf. Learn. Theory (PMLR, New York), 2888–2909.Google Scholar
  • Owen AB (2007) A robust hybrid of lasso and ridge regression. Contemporary Math. 443(7):59–72.CrossrefGoogle Scholar
  • Peng J, Wang P, Zhou N, Zhu J (2009) Partial correlation estimation by joint sparse regression models. J. Amer. Statist. Assoc. 104(486):735–746.CrossrefGoogle Scholar
  • Ravikumar P, Raskutti G, Wainwright MJ, Yu B (2008) Model selection in gaussian graphical models: High-dimensional consistency of l1-regularized MLE. Koller D, Schuurmans D, Bengio Y, Bottou L, eds Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1329–1336.Google Scholar
  • Reuther A, Kepner J, Byun C, Samsi S, Arcand W, Bestor D, Bergeron B, et al. (2018) Interactive supercomputing on 40,000 cores for machine learning and data analysis. Proc. IEEE High Performance Extreme Comput. Conf. (IEEE, Piscataway, NJ), 1–6.Google Scholar
  • Rothman AJ, Bickel PJ, Levina E, Zhu J (2008) Sparse permutation invariant covariance estimation. Electronic J. Statist. 2:494–515.CrossrefGoogle Scholar
  • Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3):475–494.CrossrefGoogle Scholar
  • Wainwright MJ (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint, vol. 48 (Cambridge University Press, Cambridge, UK).CrossrefGoogle Scholar
  • Wang W, Wainwright MJ, Ramchandran K (2010) Information-theoretic bounds on model selection for Gaussian Markov random fields. Proc. IEEE Internat. Sympos. Inform. Theory (IEEE, Piscataway, NJ), 1373–1377.Google Scholar
  • Wolsey LA, Nemhauser GL (1999) Integer and Combinatorial Optimization, vol. 55 (John Wiley & Sons, New York).Google Scholar
  • Xie W, Deng X (2020) Scalable algorithms for the sparse ridge regression. SIAM J. Optim. 30(4):3359–3386.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.