A Splicing Approach to Best Subset of Groups Selection

Published Online: https://doi.org/10.1287/ijoc.2022.1241

References

  • Bach FR (2008) Consistency of the group lasso and multiple kernel learning. J. Machine Learn. Res. 9(40):1179–1225.
  • Ben-Haim Z, Eldar YC (2011) Near-oracle performance of greedy block-sparse estimation techniques from noisy measurements. IEEE J. Selected Topics Signal Processing 5(5):1032–1047.
  • Bertsimas D, Van Parys B (2020) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Annals Statist. 48(1):300–323.
  • Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Annals Statist. 44(2):813–852.
  • Bertsimas D, Digalakis V Jr, Li ML, Lami OS (2021) Slowly varying regression under sparsity. Preprint, submitted September 1, https://arxiv.org/abs/2102.10773.
  • Breheny P, Huang J (2015) Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statist. Comput. 25(2):173–187.
  • Chen J, Chen Z (2008) Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95(3):759–771.
  • Chiang AP, Beck JS, Yen HJ, Tayeh MK, Scheetz TE, Swiderski RE, Nishimura DY, et al. (2006) Homozygosity mapping with SNP arrays identifies TRIM32, an E3 ubiquitin ligase, as a Bardet–Biedl syndrome gene (BBS11). Proc. National Acad. Sci. USA 103(16):6287–6292.
  • Eldar YC, Mishali M (2009) Robust recovery of signals from a structured union of subspaces. IEEE Trans. Inform. Theory 55(11):5302–5316.
  • Eldar YC, Kuppinger P, Bölcskei H (2010) Block-sparse signals: Uncertainty relations and efficient recovery. IEEE Trans. Signal Processing 58(6):3042–3054.
  • Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96(456):1348–1360.
  • Fan J, Feng Y, Song R (2011) Nonparametric independence screening in sparse ultra-high-dimensional additive models. J. Amer. Statist. Assoc. 106(494):544–557.
  • Hazimeh H, Mazumder R, Radchenko P (2021a) Grouped variable selection with discrete optimization: Computational and statistical perspectives. Preprint, submitted October 17, https://arxiv.org/abs/2104.07084.
  • Hazimeh H, Mazumder R, Saab A (2021b) Sparse regression at scale: Branch-and-bound rooted in first-order optimization. Math. Programming 1–42.
  • Hu Y, Li C, Meng K, Qin J, Yang X (2017) Group sparse optimization via ℓp,q regularization. J. Machine Learn. Res. 18(30):1–52.
  • Huang J, Zhang T (2010) The benefit of group sparsity. Annals Statist. 38(4):1978–2004.
  • Huang J, Breheny P, Ma S (2012) A selective review of group selection in high-dimensional models. Statist. Sci. 27(4):481–499.
  • Huang J, Horowitz JL, Wei F (2010) Variable selection in nonparametric additive models. Annals Statist. 38(4):2282–2313.
  • Huang J, Jiao Y, Liu Y, Lu X (2018) A constructive approach to l0 penalized regression. J. Machine Learn. Res. 19(10):1–37.
  • Ito K, Kunisch K (2013) A variational approach to sparsity optimization based on Lagrange multiplier theory. Inverse Problems 30(1):015001.
  • Jacob L, Obozinski G, Vert JP (2009) Group lasso with overlap and graph lasso. Proc. 26th Annual Internat. Conf. on Machine Learn. (Association for Computing Machinery, New York), 433–440.
  • Jain P, Rao N, Dhillon IS (2016) Structured sparse regression via greedy hard thresholding. Lee D, Sugiyama M, Luxburg U, Guyon I, Garnett R, eds. Advances in Neural Information Processing Systems, vol. 29 (Curran Associates, Inc., New York), 1516–1524.
  • Jenatton R, Audibert JY, Bach F (2011) Structured variable selection with sparsity-inducing norms. J. Machine Learn. Res. 12(84):2777–2824.
  • Jiao Y, Jin B, Lu X (2017) Group sparse recovery via the ℓ0(ℓ2) penalty: Theory and algorithm. IEEE Trans. Signal Processing 65(4):998–1012.
  • Kiefer J (1953) Sequential minimax search for a maximum. Proc. Amer. Math. Soc. 4(3):502–506.
  • Meinshausen N, Bühlmann P (2010) Stability selection. J. Royal Statist. Soc. Ser. B 72(4):417–473.
  • Natarajan BK (1995) Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2):227–234.
  • Obozinski G, Wainwright MJ, Jordan MI (2011) Support union recovery in high-dimensional multivariate regression. Annals Statist. 39(1):1–47.
  • Pan W, Xie B, Shen X (2010) Incorporating predictor network in penalized regression with application to microarray data. Biometrics 66(2):474–484.
  • Pan W, Wang X, Xiao W, Zhu H (2019) A generic sure independence screening procedure. J. Amer. Statist. Assoc. 114(526):928–937.
  • Peng J, Zhu J, Bergamaschi A, Han W, Noh DY, Pollack JR, Wang P (2010) Regularized multivariate regression for identifying master predictors with application to integrative genomics study of breast cancer. Annals Appl. Statist. 4(1):53–77.
  • Qian W, Li W, Sogawa Y, Fujimaki R, Yang X, Liu J (2019) An interactive greedy approach to group sparsity in high dimensions. Technometrics 61(3):409–421.
  • Scheetz TE, Kim KYA, Swiderski RE, Philp AR, Braun TA, Knudtson KL, Dorrance AM, et al. (2006) Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proc. National Acad. Sci. USA 103(39):14429–14434.
  • Schwarz G (1978) Estimating the dimension of a model. Annals Statist. 6(2):461–464.
  • Shen X, Pan W, Zhu Y, Zhou H (2013) On constrained and regularized high-dimensional regression. Annals Inst. Statist. Math. 65(5):807–832.
  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. Ser. B 58(1):267–288.
  • Wainwright MJ (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint, vol. 48 (Cambridge University Press, Cambridge, UK).
  • Wang H, Leng C (2008) A note on adaptive group lasso. Comput. Statist. Data Anal. 52(12):5277–5286.
  • Wang L, Chen G, Li H (2007) Group SCAD regression analysis for microarray time course gene expression data. Bioinformatics 23(12):1486–1494.
  • Wei F, Huang J (2010) Consistent group selection in high-dimensional linear regression. Bernoulli 16(4):1369–1384.
  • Won D, Manzour H, Chaovalitwongse W (2020) Convex optimization for group feature selection in networked data. INFORMS J. Comput. 32(1):182–198.
  • Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J. Royal Statist. Soc. Ser. B Statist. Methodological 68(1):49–67.
  • Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Annals Statist. 38(2):894–942.
  • Zhang CH, Huang J (2008) The sparsity and bias of the lasso selection in high-dimensional linear regression. Annals Statist. 36(4):1567–1594.
  • Zhao P, Rocha G, Yu B (2009) The composite absolute penalties family for grouped and hierarchical variable selection. Annals Statist. 37(6A):3468–3497.
  • Zhou Y, Zhu L (2018) Model-free feature screening for ultrahigh dimensional data through a modified Blum-Kiefer-Rosenblatt correlation. Statist. Sinica 28(3):1351–1370.
  • Zhu J, Pan W, Zheng W, Wang X (2021) Ball: An R package for detecting distribution difference and association in metric spaces. J. Statist. Software 97(6):1–31.
  • Zhu J, Wen C, Zhu J, Zhang H, Wang X (2020) A polynomial algorithm for best-subset selection problem. Proc. National Acad. Sci. USA 117(52):33117–33123.
  • Zhu J, Wang X, Hu L, Huang J, Jiang K, Zhang Y, Lin S, Zhu J (2022) abess: A fast best-subset selection library in Python and R. J. Machine Learn. Res. 23(202):1–7.
  • Zou H (2006) The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101(476):1418–1429.