Fast Multinomial Logistic Regression with Group Sparsity

Published Online: https://doi.org/10.1287/ijoc.2024.0796

References

  • Agresti A (2015) Foundations of Linear and Generalized Linear Models (John Wiley & Sons, Hoboken, NJ).
  • Baker SG (1994) The multinomial-Poisson transformation. J. Roy. Statist. Soc. Ser. D 43(4):495–504.
  • Bickel PJ, Ritov Y, Tsybakov AB (2009) Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37(4):1705–1732.
  • Bishop CM (2006) Pattern Recognition and Machine Learning (Springer, New York).
  • Bishop CM, Bishop H (2023) Deep Learning: Foundations and Concepts (Springer, Cham, Switzerland).
  • Böhning D (1992) Multinomial logistic regression algorithm. Ann. Inst. Statist. Math. 44(1):197–200.
  • Boyd S, Vandenberghe L (2004) Convex Optimization (Cambridge University Press, Cambridge, UK).
  • Bühlmann P, van de Geer S (2011) Statistics for High-Dimensional Data: Methods, Theory and Applications (Springer, Berlin, Heidelberg).
  • Dedieu A (2019) Error bounds for sparse classifiers in high-dimensions. Proc. 22nd Internat. Conf. Artificial Intelligence Statist. (PMLR, New York), 48–56.
  • Dedieu A (2021) Improved error rates for sparse (group) learning with Lipschitz loss functions. Preprint, submitted September 22, https://arxiv.org/abs/1910.08880.
  • Fan J, Guo Y, Wang K (2023) Communication-efficient accurate statistical estimation. J. Amer. Statist. Assoc. 118(542):1000–1010.
  • Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1–22.
  • Friedman J, Hastie T, Höfling H, Tibshirani R (2007) Pathwise coordinate optimization. Ann. Appl. Statist. 1(2):302–332.
  • Fu S, Chen P, Ye Z (2023a) Simplex-based proximal multicategory support vector machine. IEEE Trans. Inform. Theory 69(4):2427–2451.
  • Fu S, Zhang S, Liu Y (2018) Adaptively weighted large-margin angle-based classifiers. J. Multivariate Anal. 166:282–299.
  • Fu S, Chen P, Liu Y, Ye Z (2023b) Simplex-based multinomial logistic regression with diverging numbers of categories and covariates. Statist. Sinica 33(4):2463–2493.
  • Fu S, He Q, Zhang S, Liu Y (2019) Robust outcome weighted learning for optimal individualized treatment rules. J. Biopharmaceutical Statist. 29(4):606–624.
  • Fu S, Li S, Yu K, Chen P, Ye Z (2026) Fast multinomial logistic regression with group sparsity. https://doi.org/10.1287/ijoc.2024.0796.cd, https://github.com/INFORMSJoC/2024.0796.
  • Ghaoui LE, Viallon V, Rabbani T (2012) Safe feature elimination for the LASSO and sparse supervised learning problems. Pacific J. Optim. 8(4):667–698.
  • Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning: Data Mining, Inference and Prediction (Springer, New York).
  • Hastie T, Tibshirani R, Wainwright M (2015) Statistical Learning with Sparsity: The Lasso and Generalizations (Chapman and Hall/CRC, Boca Raton, FL).
  • Hosmer DW, Lemeshow S, Sturdivant RX (2013) Applied Logistic Regression (John Wiley & Sons, Hoboken, NJ).
  • Huang J, Zhang T (2010) The benefit of group sparsity. Ann. Statist. 38(4):1978–2004.
  • Hunter DR, Lange K (2004) A tutorial on MM algorithms. Amer. Statist. 58(1):30–37.
  • James G, Witten D, Hastie T, Tibshirani R (2021) An Introduction to Statistical Learning: With Applications in R (Springer, New York).
  • Jordan MI, Lee JD, Yang Y (2019) Communication-efficient distributed statistical inference. J. Amer. Statist. Assoc. 114(526):668–681.
  • Krishnapuram B, Carin L, Figueiredo MA, Hartemink AJ (2005) Sparse multinomial logistic regression: Fast algorithms and generalization bounds. IEEE Trans. Pattern Anal. Machine Intelligence 27(6):957–968.
  • Lai W-T, Chen R-B (2021) A review of Bayesian group selection approaches for linear regression models. Wiley Interdisciplinary Rev. Comput. Statist. 13(4):e1513.
  • Lemmens A, Gupta S (2020) Managing churn to maximize profits. Marketing Sci. 39(5):956–973.
  • Li Y, Lu F, Yin Y (2022) Applying logistic LASSO regression for the diagnosis of atypical Crohn’s disease. Sci. Rep. 12(1):11340.
  • Liang J, Poon C (2023) Variable screening for sparse online regression. J. Comput. Graphical Statist. 32(1):275–293.
  • Lipkovich I, Svensson D, Ratitch B, Dmitrienko A (2024) Modern approaches for evaluating treatment effect heterogeneity from clinical trials and observational data. Statist. Med. 43(22):4388–4436.
  • Lounici K, Pontil M, van de Geer S, Tsybakov AB (2011) Oracle inequalities and optimal inference under group sparsity. Ann. Statist. 39(4):2164–2204.
  • Lücker F, Timonina-Farkas A, Seifert RW (2025) Balancing resilience and efficiency: A literature review on overcoming supply chain disruptions. Production Oper. Management 34(6):1495–1511.
  • Meier L, van de Geer S, Bühlmann P (2008) The group lasso for logistic regression. J. Roy. Statist. Soc. Ser. B 70(1):53–71.
  • Murphy KP (2022) Probabilistic Machine Learning: An Introduction (MIT Press, Cambridge, MA).
  • Ndiaye E, Fercoq O, Gramfort A, Salmon J (2015) GAP safe screening rules for sparse multi-task and multi-class models. Adv. Neural Inform. Processing Systems 28:811–819.
  • Ndiaye E, Fercoq O, Gramfort A, Salmon J (2017) Gap safe screening rules for sparsity enforcing penalties. J. Machine Learn. Res. 18(1):4671–4703.
  • Negahban SN, Ravikumar P, Wainwright MJ, Yu B (2012) A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statist. Sci. 27(4):538–557.
  • Nibbering D, Hastie TJ (2022) Multiclass-penalized logistic regression. Comput. Statist. Data Anal. 169:107414.
  • Okumusoglu BC, Basciftci B, Kocuk B (2024) An integrated predictive maintenance and operations scheduling framework for power systems under failure uncertainty. INFORMS J. Comput. 36(5):1335–1358.
  • R Core Team (2023) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
  • Salehi F, Abbasi E, Hassibi B (2019) The impact of regularization on high-dimensional logistic regression. Adv. Neural Inform. Processing Systems 32:12005–12015.
  • Shinn LM, Li Y, Mansharamani A, Auvil LS, Welge ME, Bushell C, Khan NA, et al. (2021) Fecal bacteria as biomarkers for predicting food intake in healthy adults. J. Nutrition 151(2):423–433.
  • Simon N, Friedman J, Hastie T (2013) A blockwise descent algorithm for group-penalized multiresponse and multinomial regression. Preprint, submitted November 26, https://arxiv.org/abs/1311.6529.
  • Sun Q, Hu J, Ye Z-S (2025) Optimal abort policy for mission-critical systems under imperfect condition monitoring. Oper. Res. 73(5):2396–2416.
  • Tan Y, Shenoy PP, Sherwood B, Shenoy C, Gaddy M, Oehlert ME (2024) Bayesian network models for PTSD screening in veterans. INFORMS J. Comput. 36(2):495–509.
  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58(1):267–288.
  • Tibshirani R, Bien J, Friedman J, Hastie T, Simon N, Taylor J, Tibshirani RJ (2012) Strong rules for discarding predictors in lasso-type problems. J. Roy. Statist. Soc. Ser. B 74(2):245–266.
  • Tutz G (2011) Regression for Categorical Data (Cambridge University Press, Cambridge, UK).
  • Vincent M, Hansen NR (2014) Sparse group lasso and high dimensional multinomial classification. Comput. Statist. Data Anal. 71:771–786.
  • Wang J, Zhou J, Liu J, Wonka P, Ye J (2014) A safe screening rule for sparse logistic regression. Adv. Neural Inform. Processing Systems 27:1053–1061.
  • Wen C, Li Z, Dong R, Ni Y, Pan W (2023) Simultaneous dimension reduction and variable selection for multinomial logistic regression. INFORMS J. Comput. 35(5):1044–1060.
  • Yang Y, Zou H (2015) A fast unified algorithm for solving group-lasso penalized learning problems. Statist. Comput. 25(6):1129–1141.
  • Yang B, Matos MGD, Ferreira P (2020) The effect of shortening lock-in periods in telecommunication services. MIS Quart. 44(3):1391–1409.
  • Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J. Roy. Statist. Soc. Ser. B 68(1):49–67.
  • Yuan M, Xu Y (2023) Feature screening strategy for non-convex sparse logistic regression with log sum penalty. Inform. Sci. 624:732–747.
  • Zhang H, Chen S (2021) Concentration inequalities for statistical inference. Commun. Math. Res. 37(1):1–85.
  • Zhang C, Liu Y (2014) Multicategory angle-based large-margin classification. Biometrika 101(3):625–640.
  • Zhang C, Liu Y, Wang J, Zhu H (2016) Reinforced angle-based multicategory support vector machines. J. Comput. Graphical Statist. 25(3):806–825.
  • Zhang C, Pham M, Fu S, Liu Y (2018) Robust multicategory support vector machines using difference convex algorithm. Math. Programming 169(1):277–305.
  • Zhu J, Hastie T (2004) Classification of gene microarrays by penalized logistic regression. Biostatistics 5(3):427–443.