Free Access

An Alternating Manifold Proximal Gradient Method for Sparse Principal Component Analysis and Sparse Canonical Correlation Analysis

Shixiang Chen
Shixiang Chen
http://orcid.org/0000-0002-3261-0714
Department of Industrial and Systems Engineering, Texas A&M University, College Station, Texas 77843-3131;
Search for more papers by this author
,
Shiqian Ma
Corresponding Author
Shiqian Ma
http://orcid.org/0000-0003-1967-1069
Department of Mathematics, University of California, Davis, Davis, California 95616;
Search for more papers by this author
,
Lingzhou Xue
Lingzhou Xue
http://orcid.org/0000-0002-8252-0637
Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802;
Search for more papers by this author
,
Hui Zou
Hui Zou
http://orcid.org/0000-0003-4798-9904
School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455
Search for more papers by this author

Shixiang Chen

http://orcid.org/0000-0002-3261-0714

Department of Industrial and Systems Engineering, Texas A&M University, College Station, Texas 77843-3131;

Search for more papers by this author

Shiqian Ma

Corresponding Author

Shiqian Ma

http://orcid.org/0000-0003-1967-1069

Department of Mathematics, University of California, Davis, Davis, California 95616;

Search for more papers by this author

Lingzhou Xue

http://orcid.org/0000-0002-8252-0637

Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802;

Search for more papers by this author

Hui Zou

http://orcid.org/0000-0003-4798-9904

School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455

Search for more papers by this author

Published Online:24 Jul 2020https://doi.org/10.1287/ijoo.2019.0032

References

Absil PA, Mahony R, Sepulchre R (2009) Optimization Algorithms on Matrix Manifolds (Princeton University Press, Princeton, NJ).Google Scholar
Attouch H, Bolte J, Redont P, Soubeyran A (2010) Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2):438–457.Link, Google Scholar
Baik J, Silverstein JW (2006) Eigenvalues of large sample covariance matrices of spiked population models. J. Multivariate Anal. 97(6):1382–1408.Google Scholar
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1):183–202.Google Scholar
Bendory T, Eldar YC, Boumal N (2018) Non-convex phase retrieval from STFT measurements. IEEE Trans. Inform. Theory 64(1):467–484.Google Scholar
Bolte J, Sabach S, Teboulle M (2014) Proximal alternating linearized minimization or nonconvex and nonsmooth problems. Math. Programming 146(1–2):459–494.Google Scholar
Boumal N (2016) Nonconvex phase synchronization. SIAM J. Optim. 26(4):2355–2377.Google Scholar
Boumal N, Absil PA (2011) RTRMC: A Riemannian trust-region method for low-rank matrix completion. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. Proc. 24th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 406–414.Google Scholar
Chen M, Gao C, Ren Z, Zhou HH (2013) Sparse CCA via precision adjusted iterative thresholding. Preprint, submitted November 24, https://arxiv.org/abs/1311.6186.Google Scholar
Chen S, Ma S, So AMC, Zhang T (2020) Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim. 30(1):210–239.Google Scholar
Cherian A, Sra S (2017) Riemannian dictionary learning and sparse coding for positive definite matrices. IEEE Trans. Neural Networks Learn. Systems 28(12):2859–2871.Google Scholar
d’Aspremont A (2011) Identifying small mean-reverting portfolios. Quant. Finance 11(3):351–364.Google Scholar
d’Aspremont A, Bach F, Ghaoui LE (2008) Optimal solutions for sparse principal component analysis. J. Machine Learn. Res. 9:1269–1294.Google Scholar
d’Aspremont A, Ghaoui LE, Jordan MI, Lanckriet GRG (2007) A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 49(3):434–448.Google Scholar
Erichson NB, Zheng P, Manoharz K, Brunton SL, Kutz JN, Aravkin AY (2020) Sparse principal component analysis via variable projection. SIAM J. Appl. Math. Forthcoming.Google Scholar
Facchinei F, Pang J (2007) Finite-Dimensional Variational Inequalities and Complementarity Problems (Springer Science & Business Media, Berlin).Google Scholar
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33(1):1–22.Google Scholar
Gao C, Ma Z, Zhou HH (2017) Sparse CCA: Adaptive estimation and computational barriers. Ann. Statist. 45(5):2074–2101.Google Scholar
Hardoon DR, Shawe-Taylor J (2011) Sparse canonical correlation analysis. Machine Learn. 83(3):331–353.Google Scholar
Hiriart-Urruty JB, Strodiot JJ, Nguyen VH (1984) Generalized Hessian matrix and second-order optimality conditions for problems with C1,1 data. Appl. Math. Optim. 11(1):43–56.Google Scholar
Hotelling H (1936) Relations between two sets of variates. Biometrika 28(3–4):321–377.Google Scholar
Huang W, Hand P (2018) Blind deconvolution by a steepest descent algorithm on a quotient manifold. SIAM J. Imaging Sci. 11(4):2757–2785.Google Scholar
Jolliffe I, Trendafilov N, Uddin M (2003) A modified principal component technique based on the LASSO. J. Comput. Graph. Statist. 12(3):531–547.Google Scholar
Journee M, Nesterov Y, Richtarik P, Sepulchre R (2010) Generalized power method for sparse principal component analysis. J. Machine Learn. Res. 11(February):517–553.Google Scholar
Li X, Sun D, Toh KC (2018) A highly efficient semismooth Newton augmented Lagrangian method for solving lasso problems. SIAM J. Optim. 28(1):433–458.Google Scholar
Liu H, Yue MC, So AMC (2017) On the estimation performance and convergence rate of the generalized power method for phase synchronization. SIAM J. Optim. 27(4):2426–2446.Google Scholar
Lu Z, Zhang Y (2012) An augmented Lagrangian approach for sparse principal component analysis. Math. Programming 135:149–193.Google Scholar
Ma S (2013) Alternating direction method of multipliers for sparse principal component analysis. J. Oper. Res. Soc. China 1(2):253–274.Google Scholar
Magnus JR, Neudecker H (2018) Matrix Differential Calculus with Applications in Statistics and Econometrics (Wiley, Hoboken, NJ).Google Scholar
Mifflin R (1977) Semismooth and semiconvex functions in constrained optimization. SIAM J. Control Optim. 15(6):959–972.Google Scholar
Moghaddam B, Weiss Y, Avidan S (2005) Spectral bounds for sparse PCA: Exact and greedy algorithms. Weiss Y, Scholkopf B, Platt JC, eds. Proc. 18th Internat. Conf. Neural Inform. Processing Systems (MIT Press, Cambridge, MA), 915–922.Google Scholar
Nadler B (2008) Finite sample approximation results for principal component analysis: A matrix perturbation approach. Ann. Statist. 36(6):2791–2817.Google Scholar
Parkhomenko E, Tritchler D, Beyene J (2009) Sparse canonical correlation analysis with application to genomic data integration. Statist. Appl. Genetics Molecular Biol. 8(1):1–34.Google Scholar
Paul D (2007) Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica 17:1617–1642.Google Scholar
Pearson K (1901) LIII. On lines and planes of closest fit to systems of points in space. London Edinburgh Dublin Philos. Magazine J. Sci. 2(11):559–572.Google Scholar
Qi H, Sun D (2011) An augmented Lagrangian dual approach for the H-weighted nearest correlation matrix problem. IMA J. Numer. Anal. 31(2):491–511.Google Scholar
Qi L, Sun J (1993) A nonsmooth version of Newton’s method. Math. Programming 58:353–367.Google Scholar
Shen H, Huang JZ (2008) Sparse principal component analysis via regularized low rank matrix approximation. J. Multivariate Anal. 99(6):1015–1034.Google Scholar
Solodov MV, Svaiter BF (1998) A globally convergent inexact Newton method for systems of monotone equations. Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods (Springer, Dordrecht), 355–369.Google Scholar
Sun J, Qu Q, Wright J (2017) Complete dictionary recovery over the sphere I: Overview and the geometric picture. IEEE Trans. Inform. Theory 63(2):853–884.Google Scholar
Sun J, Qu Q, Wright J (2018) A geometrical analysis of phase retrieval. Foundations Comput. Math. 18(5):1131–1198.Google Scholar
Suo X, Minden V, Nelson B, Tibshirani R, Saunders M (2017) Sparse canonical correlation analysis. Preprint, submitted May 30, https://arxiv.org/abs/1705.10865.Google Scholar
Ulbrich M (2011) Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces, vol. 11 (SIAM, Philadelphia).Google Scholar
Vandereycken B (2013) Low-rank matrix completion by Riemannian optimization. SIAM J. Optim. 23(2):1214–1236.Google Scholar
Vu VQ, Cho J, Lei J, Rohe K (2013) Fantope projection and selection: A near-optimal convex relaxation of sparse PCA. Burges CJC, Bouttou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Proc. 26th Internat. Conf. Neural Inform. Processing Systems (Curran Associates, Red Hook, NY), 2670–2678.Google Scholar
Wang C, Sun D, Toh KC (2010) Solving log-determinant optimization problems by a Newton-CG primal proximal point algorithm. SIAM J. Optim. 20(6):2994–3013.Google Scholar
Wiesel A, Kliger M, Hero AO III (2008) A greedy approach to sparse canonical correlation analysis. Preprint, submitted January 17, arXiv:0801.2748.Google Scholar
Witten DM, Tibshirani R, Hastie T (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatist. 10(3):515–534.Google Scholar
Xiao X, Li Y, Wen Z, Zhang L (2018) A regularized semi-smooth Newton method with projection steps for composite convex programs. J. Sci. Comput. 76(1):364–389.Google Scholar
Yang J, Sun D, Toh KC (2013) A proximal point algorithm for log-determinant optimization with group lasso regularization. SIAM J. Optim. 23(2):857–893.Google Scholar
Yang L, Sun D, Toh KC (2015) SDPNAL+: A majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Math. Programming Comput. 7(3):331–366.Google Scholar
Yuan XT, Zhang T (2013) Truncated power method for sparse eigenvalue problems. J. Machine Learn. Res. 14(1):899–925.Google Scholar
Zhao X, Sun D, Toh KC (2010) A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim. 20(4):1737–1765.Google Scholar
Zhou G, Toh KC (2005) Superlinear convergence of a Newton-type algorithm for monotone equations. J. Optim. Theory Appl. 125(1):205–221.Google Scholar
Zou H, Xue L (2018) A selective overview of sparse principal component analysis. Proc. IEEE 106(8):1311–1320.Google Scholar
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. B 67(2):301–320.Google Scholar
Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J. Comput. Graphical Statist. 15(2):265–286.Google Scholar

cover image INFORMS Journal on Optimization

Volume 2, Issue 3

Summer 2020

Pages 145-228, C3

Article Information

Supplemental Material

Metrics

Information

Received:March 27, 2019
Accepted:March 25, 2020
Published Online:July 24, 2020

Cite as

Shixiang Chen, Shiqian Ma, Lingzhou Xue, Hui Zou (2020) An Alternating Manifold Proximal Gradient Method for Sparse Principal Component Analysis and Sparse Canonical Correlation Analysis. INFORMS Journal on Optimization 2(3):192-208.

https://doi.org/10.1287/ijoo.2019.0032

Keywords

PDF download

Available Issues

Available Issues

Available Issues

An Alternating Manifold Proximal Gradient Method for Sparse Principal Component Analysis and Sparse Canonical Correlation Analysis

References

Volume 2, Issue 3

Article Information

Supplemental Material

Metrics

Information

Cite as

Keywords

Sign Up for INFORMS Publications Updates and News