Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

Kayhan Behdin
Corresponding Author
Kayhan Behdin
[email protected]
https://orcid.org/0000-0003-3482-0421
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author
,
Wenyu Chen
Wenyu Chen
[email protected]
https://orcid.org/0009-0007-7952-7735
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author
,
Rahul Mazumder
Rahul Mazumder
[email protected]
https://orcid.org/0000-0003-1384-9743
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139; and Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
Search for more papers by this author

Kayhan Behdin

Corresponding Author

Kayhan Behdin

[email protected]

https://orcid.org/0000-0003-3482-0421

Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Wenyu Chen

[email protected]

https://orcid.org/0009-0007-7952-7735

Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Rahul Mazumder

[email protected]

https://orcid.org/0000-0003-1384-9743

Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139; and Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139

Search for more papers by this author

Published Online:9 Apr 2026https://doi.org/10.1287/opre.2023.0381

References

Aktürk MS, Atamtürk A, Gürel S (2009) A strong conic quadratic reformulation for machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3):187–191.Crossref, Google Scholar
Behdin K, Mazumder R (2021) Sparse PCA: A new scalable estimator based on integer programming. Preprint, submitted September 23, https://arxiv.org/abs/2109.11142.Google Scholar
Belotti P, Kirches C, Leyffer S, Linderoth J, Luedtke J, Mahajan A (2013) Mixed-integer nonlinear optimization. Acta Numerica 22:1–131. Crossref, Google Scholar
Bertsimas D, Parys BV (2020) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Ann. Statist. 48(1):300–323.Crossref, Google Scholar
Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Ann. Statist. 44(2):813–852.Crossref, Google Scholar
Bertsimas D, Lamperski J, Pauphilet J (2020) Certifiably optimal sparse inverse covariance estimation. Math. Programming 184(1):491–530.Crossref, Google Scholar
Besag J (1975) Statistical analysis of non-lattice data. J. Roy. Statist. Soc. Ser. D Statistician 24(3):179–195.Google Scholar
Cai T, Liu W, Luo X (2011) A constrained ell-1 minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106(494):594–607.Crossref, Google Scholar
Chen W, Mazumder R (2024) Subgradient regularized multivariate convex regression at scale. SIAM J. Optim. 34(3):2350–2377.Crossref, Google Scholar
Dempster AP (1972) Covariance selection. Biometrics 28(1):157–175.Crossref, Google Scholar
Dey SS, Mazumder R, Wang G (2022) Using ℓ1-relaxation and integer programming to obtain dual bounds for sparse PCA. Oper. Res. 70(3):1914–1932.Link, Google Scholar
Dong H, Chen K, Linderoth J (2015) Regularization vs. relaxation: A conic optimization perspective of statistical variable selection. Preprint, submitted October 20, https://arxiv.org/abs/1510.06083.Google Scholar
Fattahi S, Gomez A (2021) Scalable inference of sparsely-changing Gaussian Markov random fields. Adv. Neural Inform. Processing Systems 34:6529–6541.Google Scholar
Frangioni A, Gentile C (2006) Perspective cuts for a class of convex 0–1 mixed integer programs. Math. Programming 106(2):225–236.Crossref, Google Scholar
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.Crossref, Google Scholar
Friedman J, Hastie T, Tibshirani R (2010a) Applications of the lasso and grouped lasso to the estimation of sparse graphical models. Technical report, Stanford University, Stanford, CA.Google Scholar
Friedman J, Hastie T, Tibshirani R (2010b) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1.Crossref, Google Scholar
Günlük O, Linderoth J (2010) Perspective reformulations of mixed integer nonlinear programs with indicator variables. Math. Programming 124(1):183–205.Crossref, Google Scholar
Hastie T, Tibshirani R, Wainwright M (2015) Statistical Learning with Sparsity: The Lasso and Generalizations, Monographs on Statistics and Applied Probability, vol. 143 (CRC Press, Boca Raton, FL).Google Scholar
Hastie T, Tibshirani R, Friedman JH, Friedman JH (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, vol. 2 (Springer, Berlin).Crossref, Google Scholar
Hazimeh H, Mazumder R (2020) Fast best subset selection: Coordinate descent and local combinatorial optimization algorithms. Oper. Res. 68(5):1517–1537.Link, Google Scholar
Hazimeh H, Mazumder R, Saab A (2022) Sparse regression at scale: Branch-and-bound rooted in first-order optimization. Math. Programming 196(1–2):347–388.Crossref, Google Scholar
Höfling H, Tibshirani R (2009) Estimation of sparse binary pairwise Markov networks using pseudo-likelihoods. J. Machine Learn. Res. 10(4):883–906. Google Scholar
Khare K, Oh S-Y, Rajaratnam B (2015) A Convex pseudolikelihood framework for high dimensional partial correlation estimation with convergence guarantees. J. Royal Statist. Soc. Series B: Statist. Methodology 77(4):803–825. Crossref, Google Scholar
Markowitz H (1952) Portfolio selection. J. Finance 7(1):77–91.Google Scholar
Mazumder R, Hastie T (2012) The graphical lasso: New insights and alternatives. Electronic J. Statist. 6:2125–2149. Crossref, Google Scholar
Mazumder R, Radchenko P, Dedieu A (2023) Subset selection with shrinkage: Sparse linear modeling when the SNR is low. Oper. Res. 71(1):129–147.Link, Google Scholar
Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the Lasso. Ann. Statist. 34(3):1436–1462.Crossref, Google Scholar
Misra S, Vuffray M, Lokhov AY (2020) Information theoretic optimal learning of Gaussian graphical models. Abernethy J, Agarwal S, eds. Proc. Conf. Learn. Theory (PMLR, New York), 2888–2909.Google Scholar
Owen AB (2007) A robust hybrid of lasso and ridge regression. Contemporary Math. 443(7):59–72.Crossref, Google Scholar
Peng J, Wang P, Zhou N, Zhu J (2009) Partial correlation estimation by joint sparse regression models. J. Amer. Statist. Assoc. 104(486):735–746.Crossref, Google Scholar
Ravikumar P, Raskutti G, Wainwright MJ, Yu B (2008) Model selection in gaussian graphical models: High-dimensional consistency of l1-regularized MLE. Koller D, Schuurmans D, Bengio Y, Bottou L, eds Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 1329–1336.Google Scholar
Reuther A, Kepner J, Byun C, Samsi S, Arcand W, Bestor D, Bergeron B, et al. (2018) Interactive supercomputing on 40,000 cores for machine learning and data analysis. Proc. IEEE High Performance Extreme Comput. Conf. (IEEE, Piscataway, NJ), 1–6.Google Scholar
Rothman AJ, Bickel PJ, Levina E, Zhu J (2008) Sparse permutation invariant covariance estimation. Electronic J. Statist. 2:494–515.Crossref, Google Scholar
Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3):475–494.Crossref, Google Scholar
Wainwright MJ (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint, vol. 48 (Cambridge University Press, Cambridge, UK).Crossref, Google Scholar
Wang W, Wainwright MJ, Ramchandran K (2010) Information-theoretic bounds on model selection for Gaussian Markov random fields. Proc. IEEE Internat. Sympos. Inform. Theory (IEEE, Piscataway, NJ), 1373–1377.Google Scholar
Wolsey LA, Nemhauser GL (1999) Integer and Combinatorial Optimization, vol. 55 (John Wiley & Sons, New York).Google Scholar
Xie W, Deng X (2020) Scalable algorithms for the sparse ridge regression. SIAM J. Optim. 30(4):3359–3386.Crossref, Google Scholar

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Received:July 13, 2023
Accepted:February 24, 2026
Published Online:April 09, 2026

Cite as

Kayhan Behdin, Wenyu Chen, Rahul Mazumder (2026) Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives. Operations Research 0(0).

https://doi.org/10.1287/opre.2023.0381

Keywords

Acknowledgments

Wenyu Chen and Kayhan Behdin contributed to the work when they were PhD students at the MIT Operations Research Center. The authors thank MIT SuperCloud (Reuther et al. 2018) for partially providing the computational resources for this work.

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

References

Articles In Advance

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News