Sparse PCA with Multiple Components

Published Online:https://doi.org/10.1287/opre.2023.0598

References

  • Ahmadi AA, Dash S, Hall G (2017) Optimization over structured subsets of positive semidefinite matrices via column generation. Discrete Optim. 24:129–151.CrossrefGoogle Scholar
  • Alizadeh F, Goldfarb D (2003) Second-order cone programming. Math. Programming 95(1):3–51.CrossrefGoogle Scholar
  • Amini AA, Wainwright MJ (2008) High-dimensional analysis of semidefinite relaxations for sparse principal components. Proc. 2008 IEEE Internat. Sympos. Inform. Theory (IEEE, New York), 2454–2458.Google Scholar
  • Asteris M, Papailiopoulos D, Kyrillidis A, Dimakis AG (2015) Sparse PCA via bipartite matchings. Cortes C, Lawrence N, Lee DD, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems 28 NIPS 2015 (MIT Press, Cambridge, MA), 766–774.Google Scholar
  • Avellaneda M, Lee JH (2010) Statistical arbitrage in the US equities market. Quant. Finance 10(7):761–782.CrossrefGoogle Scholar
  • Barker G, Carlson D (1975) Cones of diagonally dominant matrices. Pacific J. Math. 57(1):15–32.CrossrefGoogle Scholar
  • Behdin K, Mazumder R (2026) Sparse PCA: A new scalable estimator based on integer programming. Institute of Mathematical Statistics, https://imstat.org/journals-and-publications/annals-of-statistics/annals-of-statistics-future-papers/.Google Scholar
  • Benidis K, Sun Y, Babu P, Palomar DP (2016) Orthogonal sparse PCA and covariance estimation via Procrustes reformulation. IEEE Trans. Signal Processing 64(23):6211–6226.CrossrefGoogle Scholar
  • Berk L, Bertsimas D (2019) Certifiably optimal sparse principal component analysis. Math. Programming Comput. 11(3):381–420.CrossrefGoogle Scholar
  • Berthet Q, Rigollet P (2013) Optimal detection of sparse principal components in high dimension. Ann. Statist. 41(4):1780–1815.CrossrefGoogle Scholar
  • Bertsekas DP (1996) Constrained Optimization and Lagrange Multiplier Methods (Athena Scientific, Nashua, NH).Google Scholar
  • Bertsimas D, Cory-Wright R (2020) On polyhedral and second-order cone decompositions of semidefinite optimization problems. Oper. Res. Lett. 48(1):78–85.CrossrefGoogle Scholar
  • Bertsimas D, Kitane DL (2023) Sparse PCA: A geometric approach. J. Machine Learn. Res. 24(32):1–33.Google Scholar
  • Bertsimas D, Cory-Wright R, Pauphilet J (2021) A unified approach to mixed-integer optimization problems with logical constraints. SIAM J. Optim. 31(3):2340–2367.CrossrefGoogle Scholar
  • Bertsimas D, Cory-Wright R, Pauphilet J (2022a) Mixed-projection conic optimization: A new paradigm for modeling rank constraints. Oper. Res. 70(6):3321–3344.LinkGoogle Scholar
  • Bertsimas D, Cory-Wright R, Pauphilet J (2022b) Solving large-scale sparse PCA to certifiable (near) optimality. J. Machine Learn. Res. 23(13):1–35.Google Scholar
  • Boutsidis C, Drineas P, Magdon-Ismail M (2011) Sparse features for PCA-like linear regression. Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger K, eds. Adv. Neural Inform. Processing Systems 24 NIPS 2011 (Curran Associates, Red Hook, NY), 2285–2293.Google Scholar
  • Bresler G, Park SM, Persu M (2018) Sparse PCA from sparse linear regression. Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Adv. Neural Inform. Processing Systems 31 NeurIPS 2018 (Curran Associates, Red Hook, NY).Google Scholar
  • Bühler T (2014) A flexible framework for solving constrained ratio problems in machine learning. PhD thesis, Saarland University, Saarbrücken, Germany.Google Scholar
  • d’Aspremont A, Bach F, El Ghaoui L (2008) Optimal solutions for sparse principal component analysis. J. Machine Learn. Res. 9(7):1269–1294.Google Scholar
  • d’Aspremont A, El Ghaoui L, Jordan MI, Lanckriet GR (2007) A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 49(3):434–448.CrossrefGoogle Scholar
  • Del Pia A (2023) Sparse PCA on fixed-rank matrices. Math. Programming 198(1):139–157.CrossrefGoogle Scholar
  • Deshpande Y, Montanari A (2014a) Information-theoretically optimal sparse PCA. 2014 IEEE Internat. Sympos. Inform. Theory (IEEE, New York), 2197–2201.Google Scholar
  • Deshpande Y, Montanari A (2014b) Sparse PCA via covariance thresholding. Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KO, eds. Adv. Neural Inform. Processing Systems 27 NIPS 2014 (MIT Press, Cambridge, MA), 334–342.Google Scholar
  • Deshpande Y, Montanari A (2016) Sparse PCA via covariance thresholding. J. Machine Learn. Res. 17(141):1–41.Google Scholar
  • Dey SS, Mazumder R, Wang G (2021) Using ℓ1-relaxation and integer programming to obtain dual bounds for sparse PCA. Oper. Res. 70(3):1914–1932.LinkGoogle Scholar
  • Dey SS, Molinaro M, Wang G (2022) Solving sparse principal component analysis with global support. Math. Programming 199:421–459.CrossrefGoogle Scholar
  • Ding Y, Kunisky D, Wein AS, Bandeira AS (2024) Subexponential-time algorithms for sparse PCA. Foundations Comput. Math. 24:865–914.CrossrefGoogle Scholar
  • Eckart C, Young G (1936) The approximation of one matrix by another of lower rank. Psychometrika 1(3):211–218.CrossrefGoogle Scholar
  • Fan J, Liao Y, Wang W (2016) Projected principal component analysis in factor models. Ann. Statist. 44(1):219–254.CrossrefGoogle Scholar
  • Fisher ML (1981) The Lagrangian relaxation method for solving integer programming problems. Management Sci. 27(1):1–18.LinkGoogle Scholar
  • Gally T, Pfetsch ME (2016) Computing restricted isometry constants via mixed-integer semidefinite programming. Preprint, submitted April 5, https://optimization-online.org/2016/04/5395/.Google Scholar
  • Garey MR, Johnson DS (1979) Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman & Co Ltd, New York).Google Scholar
  • Hein M, Bühler T (2010) An inverse power method for nonlinear eigenproblems with applications in 1-spectral clustering and sparse PCA. Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A, eds. Adv. Neural Inform. Processing Systems 23 NIPS 2010 (Curran Associates, Red Hook, NY), 847–855.Google Scholar
  • Hopcroft JE, Karp RM (1973) An n5/2 algorithm for maximum matchings in bipartite graphs. SIAM J. Comput. 2(4):225–231.CrossrefGoogle Scholar
  • Horn RA, Johnson CR (1985) Matrix Analysis (Cambridge University Press, New York).CrossrefGoogle Scholar
  • Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J. Ed. Psych. 24(6):417–441.CrossrefGoogle Scholar
  • Johnstone IM, Lu AY (2009) On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc. 104(486):682–693.CrossrefGoogle Scholar
  • Jolliffe IT, Trendafilov NT, Uddin M (2003) A modified principal component technique based on the LASSO. J. Comput. Graphical Statist. 12(3):531–547.CrossrefGoogle Scholar
  • Journée M, Nesterov Y, Richtárik P, Sepulchre R (2010) Generalized power method for sparse principal component analysis. J. Machine Learn. Res. 11(15):517–553.Google Scholar
  • Kim J, Tawarmalani M, Richard JPP (2022) Convexification of permutation-invariant sets and an application to sparse principal component analysis. Math. Oper. Res. 47(4):2547–2584.LinkGoogle Scholar
  • Krauthgamer R, Nadler B, Vilenchik D (2015) Do semidefinite relaxations solve sparse PCA up to the information limit? Ann. Statist. 43(3):1300–1322.CrossrefGoogle Scholar
  • Li Y, Xie W (2024) Beyond symmetry: Best submatrix selection for the sparse truncated SVD. Math. Programming 208:1–50.CrossrefGoogle Scholar
  • Li Y, Xie W (2025) Exact and approximation algorithms for sparse PCA. INFORMS J. Comput. 37(3):582–602.LinkGoogle Scholar
  • Lu Z, Zhang Y (2012) An augmented Lagrangian approach for sparse principal component analysis. Math. Programming 135(1):149–193.CrossrefGoogle Scholar
  • Lubin M, Vielma JP, Zadik I (2022) Mixed-integer convex representability. Math. Oper. Res. 47(1):720–749.LinkGoogle Scholar
  • Mackey L (2008) Deflation methods for sparse PCA. Koller D, Schuurmans D, Bengio Y, Bottou L, eds. Adv. Neural Inform. Processing Systems 21 NIPS 2008 (Curran Associates, Red Hook, NY), 1017–1024.Google Scholar
  • Marshall AW, Olkin I (1979) Inequalities: Theory of Majorization and Its Applications (Academic Press, New York).Google Scholar
  • Naikal N, Yang AY, Sastry SS (2011) Informative feature selection for object recognition via sparse PCA. Proc. 2011 Internat. Conf. Comput. Vision (IEEE Computer Society, Washington, DC), 818–825.Google Scholar
  • Nocedal J, Wright SJ (2006) Numerical Optimization (Springer, New York).Google Scholar
  • Overton ML, Womersley RS (1992) On the sum of the largest eigenvalues of a symmetric matrix. SIAM J. Matrix Anal. Appl. 13(1):41–45.CrossrefGoogle Scholar
  • Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. London Edinburgh Dublin Philos. Magazine J. Sci. 2(11):559–572.CrossrefGoogle Scholar
  • Probel CJ, Tropp JA (2011) Large-scale PCA with sparsity constraints. Technical report no. 2011-02, California Institute of Technology, Pasadena.Google Scholar
  • Rudin C, Chen C, Chen Z, Huang H, Semenova L, Zhong C (2022) Interpretable machine learning: Fundamental principles and 10 grand challenges. Statist. Surveys 16:1–85.CrossrefGoogle Scholar
  • Tan KM, Petersen A, Witten D (2014) Classification of RNA-seq data. Datta S, Nettleton D, eds. Statistical Analysis of Next Generation Sequencing Data (Springer, Cham, Switzerland), 219–246.CrossrefGoogle Scholar
  • Tropp JA, Yurtsever A, Udell M, Cevher V (2017) Practical sketching algorithms for low-rank matrix approximation. SIAM J. Matrix Anal. Appl. 38(4):1454–1485.CrossrefGoogle Scholar
  • Udell M, Horn C, Zadeh R, Boyd S (2016) Generalized low rank models. Foundations Trends Machine Learn. 9(1):1–118.CrossrefGoogle Scholar
  • Vu VQ, Lei J (2013) Minimax sparse principal subspace estimation in high dimensions. Ann. Statist. 41(6):2905–2947.CrossrefGoogle Scholar
  • Vu VQ, Cho J, Lei J, Rohe K (2013) Fantope projection and selection: A near-optimal convex relaxation of sparse PCA. Burges CJ, Bottou L, Welling M, Ghahramani Z, Weinberger K, eds. Adv. Neural Inform. Processing Systems 26 NIPS 2013 (Curran Associates, Red Hook, NY), 2670–2678.Google Scholar
  • Wang J, Dey SS, Xie Y (2023) Variable selection for kernel two-sample tests. Preprint, submitted February 15, https://arxiv.org/abs/2302.07415.Google Scholar
  • Witten DM, Tibshirani R, Hastie T (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10(3):515–534.CrossrefGoogle Scholar
  • Yuan XT, Zhang T (2013) Truncated power method for sparse eigenvalue problems. J. Machine Learn. Res. 14(28):899–925.Google Scholar
  • Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J. Comput. Graphical Statist. 15(2):265–286.CrossrefGoogle Scholar
INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.