Open Access

Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms

Hussein Hazimeh
Corresponding Author
Hussein Hazimeh
[email protected]
https://orcid.org/0000-0003-4501-0678
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139;
Search for more papers by this author
,
Rahul Mazumder
Rahul Mazumder
[email protected]
https://orcid.org/0000-0003-1384-9743
Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139;MIT Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142;
Search for more papers by this author

Hussein Hazimeh

Corresponding Author

Hussein Hazimeh

[email protected]

https://orcid.org/0000-0003-4501-0678

Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139;

Search for more papers by this author

Rahul Mazumder

[email protected]

https://orcid.org/0000-0003-1384-9743

Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139;MIT Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts 02142;

Search for more papers by this author

Published Online:13 Aug 2020https://doi.org/10.1287/opre.2019.1919

References

Beck A, Eldar YC (2013) Sparsity constrained nonlinear optimization: Optimality conditions and algorithms. SIAM J. Optim. 23(3):1480–1509.Google Scholar
Beck A, Tetruashvili L (2013) On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4):2037–2060.Google Scholar
Bertsekas D (2016) Nonlinear Programming, 3rd ed. (Athena Scientific, Nashua, NH).Google Scholar
Bertsimas D, Van Parys B (2017) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Working paper, Massachusetts Institute of Technology, Cambridge.Google Scholar
Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Ann. Statist. 44(2):813–852.Crossref, Google Scholar
Blumensath T, Davies M (2009) Iterative hard thresholding for compressed sensing. Appl. Comput. Harmonic Anal. 27(3):265–274.Crossref, Google Scholar
Breheny P, Huang J (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Statist. 5(1):232–253.Crossref, Google Scholar
Bühlmann P, van de Geer S (2011) Statistics for High-Dimensional Data (Springer, Berlin).Crossref, Google Scholar
Erdman C, Bates N (2014) The U.S. Census Bureau mail return rate challenge: Crowdsourcing to develop a hard-to-count score. Statistics Research Report 2014-08, U.S. Census Bureau, Washington, DC.Google Scholar
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96(456):1348–1360.Crossref, Google Scholar
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1–22.Crossref, Google Scholar
Gamarnik D, Zadik I (2017) High dimensional regression with binary coefficients. estimating squared error and a phase transition. Kale S, Shamir O, eds. Proc. 2017 Conf. Learn. Theory (PMLR, Amsterdam), 948–953.Google Scholar
Greenshtein E (2006) Best subset selection, persistence in high-dimensional statistical learning and optimization under ℓ1 constraint. Ann. Statist. 34(5):2367–2386.Crossref, Google Scholar
Harrison D, Rubinfeld DL (1978) Hedonic housing prices and the demand for clean air. J. Environ. Econom. Management 5(1):81–102.Google Scholar
Hastie T, Tibshirani R, Tibshirani RJ (2017) Extended comparisons of best subset selection, forward stepwise selection, and the Lasso. Working paper, Stanford University, Stanford, CA.Google Scholar
Hastie T, Tibshirani R, Wainwright M (2015) Statistical Learning with Sparsity: The Lasso and Generalizations (CRC Press, Boca Raton, FL).Crossref, Google Scholar
He R, McAuley J (2016) Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. Proc. 25th Internat. Conf. World Wide Web (International World Wide Web Conferences Steering Committee, Geneva, Switzerland), 507–517.Google Scholar
Lu Z (2014) Iterative hard thresholding methods for l0 regularized convex cone programming. Math. Programming 147(1):125–154.Google Scholar
Mazumder R, Radchenko P (2017) The discrete Dantzig selector: Estimating sparse linear models via mixed integer linear optimization. IEEE Trans. Inform. Theory 63(5):3053–3075.Google Scholar
Mazumder R, Friedman JH, Hastie T (2011) Sparsenet: Coordinate descent with nonconvex penalties. J. Amer. Statist. Assoc. 106(495):1125–1138.Google Scholar
Mazumder R, Radchenko P, Dedieu A (2017) Subset selection with shrinkage: Sparse linear modeling when the SNR is low. Working paper, Massachusetts Institute of Technology, Cambridge.Google Scholar
Miller A (2002) Subset Selection in Regression (CRC Press, Boca Raton, FL).Crossref, Google Scholar
Natarajan B (1995) Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2):227–234.Crossref, Google Scholar
Nesterov Y (2012) Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2):341–362.Crossref, Google Scholar
Patrascu A, Necoara I (2015) Random coordinate descent methods for ℓ0 regularized convex optimization. IEEE Trans. Automatic Control 60(7):1811–1824.Google Scholar
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: Machine learning in Python. J. Machine Learn. Res. 12(2011):2825–2830.Google Scholar
Raskutti G, Wainwright M, Yu B (2011) Minimax rates of estimation for high-dimensional linear regression over ℓq-balls. IEEE Trans. Inform. Theory 57(10):6976–6994.Crossref, Google Scholar
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58(1):267–288.Crossref, Google Scholar
Tibshirani RJ (2013) The lasso problem and uniqueness. Electronic J. Statist. 7:1456–1490.Crossref, Google Scholar
Tibshirani R, Bien J, Friedman J, Hastie T, Simon N, Taylor J, Tibshirani RJ (2012) Strong rules for discarding predictors in lasso-type problems. J. Roy. Statist. Soc. Ser. B 74(2):245–266.Crossref, Google Scholar
Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3):475–494.Crossref, Google Scholar
Wainwright MJ (2009) Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Trans. Inform. Theory 55(12):5728–5741.Crossref, Google Scholar
Zhang C-H (2010) Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38(2):894–942.Crossref, Google Scholar
Zhang C-H, Zhang T (2012) A general theory of concave regularization for high-dimensional sparse estimation problems. Statist. Sci. 27(4):576–593.Crossref, Google Scholar
Zhang Y, Wainwright M, Jordan MI (2014) Lower bounds on the performance of polynomial-time algorithms for sparse linear regression. Proc. 2014 Conf. Learn. Theory (PMLR, Amsterdam), 921–948.Google Scholar
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. Ser. B 67(2):301–320.Crossref, Google Scholar

Volume 68, Issue 5

September-October 2020

Pages iii-vi, 1285-1624, C2-C3

Article Information

Supplemental Material

Metrics

Information

Received:April 10, 2018
Accepted:June 30, 2019
Published Online:August 13, 2020

Cite as

Hussein Hazimeh, Rahul Mazumder (2020) Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms. Operations Research 68(5):1517-1537.

https://doi.org/10.1287/opre.2019.1919

Keywords

PDF download

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Available Issues

Fast Best Subset Selection: Coordinate Descent and Local Combinatorial Optimization Algorithms

References

Volume 68, Issue 5

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News