Convex Optimization for Group Feature Selection in Networked Data
Published Online:18 Jul 2019https://doi.org/10.1287/ijoc.2018.0868
References
- (1998) On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoret. Comput. Sci. 209(1):237–260.Crossref, Google Scholar
- (2010) Abnormal functional connectivity of default mode sub-networks in autism spectrum disorder patients. NeuroImage 53(1):247–256.Crossref, Google Scholar
- (2011) Convex optimization with sparsity-inducing norms. Sra S, Nowozin S, Wright SJ, eds. Optimization for Machine Learning (MIT Press, Cambridge, MA), 19–53.Crossref, Google Scholar
- (2010) Sparse methods for machine learning theory and algorithms. ECML/PKDD Tutorial, https://www.di.ens.fr/∼fbach/ecml2010tutorial/.Google Scholar
- (2016) Integer programming models for feature selection: New extensions and a randomized solution algorithm. Eur. J. Oper. Res. 250(2):389–399.Crossref, Google Scholar
- (2016) Best subset selection via a modern optimization lens. Ann. Statist. 44(2):813–852.Crossref, Google Scholar
- (2010) Detecting high log-densities: An O(n14) approximation for densest k-subgraph. Mitzenmacher M, Schulman LJ, eds. Proc. 42nd ACM Sympos. Theory Comput. (ACM, New York), 201–210.Crossref, Google Scholar
- (2014) Threshold-based preprocessing for approximating the weighted dense k-subgraph problem. Eur. J. Oper. Res. 234(3):631–640.Crossref, Google Scholar
- (2012) Overview of metrics and their correlation patterns for multiple-metric topology analysis on heterogeneous graph ensembles. Physical Rev. E 85(1):016117.Crossref, Google Scholar
- (1998) Feature selection via concave minimization and support vector machines. Shavlik JW, ed. Proc. 15th Internat. Conf. Machine Learn. (Morgan Kaufmann Publishers, San Francisco), 82–90.Google Scholar
- (2007) Direct convex relaxations of sparse SVM. Ghahramani Z. ed. Proc. 24th Internat. Conf. Machine Learn., 2007 Internat. Conf. Inductive Logical Programming, Corvallis, OR, 145–153.Crossref, Google Scholar
- (1972) Dynamic programming as applied to feature subset selection in a pattern recognition system. Donovan JJ, ed. Proc. ACM Annual Conf., vol. 1 (ACM, New York), 94–103.Crossref, Google Scholar
- (2006) Classification of magnetic resonance brain images using wavelets as input to support vector machine and neural network. Biomedical Signal Process. Control 1(1):86–92.Crossref, Google Scholar
- (2015) Diagnostic classification of intrinsic functional connectivity highlights somatosensory, default mode, and visual regions in autism. NeuroImage: Clin. 8:238–245.Crossref, Google Scholar
- (2008) Training svm with indefinite kernels. McCallum A, Roweis S, eds. Proc. 25th Internat. Conf. Machine Learn. (Omin Press, Madison, WI), 136–143.Crossref, Google Scholar
- (1984) Clustering and domination in perfect graphs. Discrete Appl. Math. 9(1):27–39.Crossref, Google Scholar
- (2009) Disease state prediction from resting state functional connectivity. Magnetic Resonance Medicine 62(6):1619–1628.Crossref, Google Scholar
- (1959) On Random Graphs, vol. 6 (Publicationes Mathematicae, Debrecen, Hungary), 290–297.Google Scholar
- (2001) Solving quadratic semi-infinite programming problems by using relaxed cutting-plane scheme. J. Comput. Appl. Math. 129(1):89–104.Crossref, Google Scholar
- (2000) Data selection for support vector machine classifiers. Ramakrishnan R, ed. Proc. 6th ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (ACM, New York), 64–70.Crossref, Google Scholar
- (2009) Mixed-integer support vector machine. LeCun Y, Zemel R, eds. NIPS Workshop on Optimization for Machine Learning (Curran Associates, Red Hook, NY).Google Scholar
- (2003) An introduction to variable and feature selection. J. Machine Learn. Res. 3(March):1157–1182.Google Scholar
- (2015) l 0-norm based structural sparse least square regression for feature selection. Pattern Recognition 48(12):3927–3940.Crossref, Google Scholar
- (1993) Semi-infinite programming: Theory, methods, and applications. SIAM Rev. 35(3):380–429.Crossref, Google Scholar
- (2016) Resting-state functional connectivity in autism spectrum disorders: A review. Frontiers Psychiatry 7:205.Crossref, Google Scholar
- (2009) Group lasso with overlap and graph lasso. Jones B, ed. Proc. 26th Annual Internat. Conf. Machine Learn. (ACM, New York), 433–440.Crossref, Google Scholar
- (2011) Structured variable selection with sparsity-inducing norms. J. Machine Learn. Res. 12:2777–2824.Google Scholar
- (2012) Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping. Ann. Appl. Statist. 6(3):1095–1117.Crossref, Google Scholar
- (2008) A minimax theorem with applications to machine learning, signal processing, and finance. SIAM J. Optim. 19(3):1344–1367.Crossref, Google Scholar
- (2012) Diagnostic neuroimaging across diseases. NeuroImage 61(2):457–463.Crossref, Google Scholar
- (2004) Learning the kernel matrix with semidefinite programming. J. Machine Learn. Res. 5:27–72.Google Scholar
- (2009a) A convex method for locating regions of interest with multi-instance learning. Altun Y, Das K, Mielikäinen T, Malerba D, Stefanowski J, Read J, Žitnik M, Ceci M, Džeroski S, eds. Machine Learning and Knowledge Discovery in Databases (Springer, New York), 15–30.Crossref, Google Scholar
- (2009b) Tighter and convex maximum margin clustering. van Dyk D, Welling M, eds. Internat. Conf. Artificial Intelligence Statistics (Microtome, Brookline, MA), 344–351.Google Scholar
- (2008) In defense of ℓ0. McCallum A, Roweis S, eds. Proc. 25th Internat. Conf. Machine Learn. (Omni Press, Madison, WI).Google Scholar
- (2009) SLEP: Sparse Learning with Efficient Projections (Arizona State University, Phoenix).Google Scholar
- (2010) Penalty decomposition methods for ℓ0-norm minimization. Technical report, Simon Fraser University, BC, Canada.Google Scholar
- (2000) The edge-weighted clique problem: Valid inequalities, facets and polyhedral computations. Eur. J. Oper. Res. 123(2):346–371.Crossref, Google Scholar
- (2014) Feature selection for support vector machines via mixed integer linear programming. Inform. Sci. 279:163–175.Crossref, Google Scholar
- (2005) Subband correlation and robust speech recognition. IEEE Trans. Speech Audio Process. 13(5):956–964.Crossref, Google Scholar
- (2005) Classifying brain states and determining the discriminating activation patterns: Support vector machine on functional MRI data. NeuroImage 28(4):980–995.Crossref, Google Scholar
- (1995) Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2):227–234.Crossref, Google Scholar
- (2004) Feature selection, ℓ1vs. ℓ2regularization, and rotational invariance. Brodley CE, ed. Proc. 21st Internat. Conf. Machine Learn., vol. 78 (ACM, New York).Crossref, Google Scholar
- (2011) On solving large-scale finite minimax problems using exponential smoothing. J. Optim. Theory Appl. 148(2):390–421.Crossref, Google Scholar
- (2011) Functional network organization of the human brain. Neuron 72(4):665–678.Crossref, Google Scholar
- (2003) Variable selection using svm based criteria. J. Machine Learn. Res. 3:1357–1370.Google Scholar
- (2007) More efficiency in multiple kernel learning. Ghahramani Z, ed. Proc. 24th Internat. Conf. Machine Learn. (Omni Press, Madison, WI), 775–782.Crossref, Google Scholar
- (2008) SimpleMKL. J. Machine Learn. Res. 9:2491–2521.Google Scholar
- (2013) Sparse overlapping sets lasso for multitask learning and its application to fMRI analysis. Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 26 (Curran Associates, Red Hook, NY), 2202–2210.Google Scholar
- (2015) Sparse network-based models for patient classification using fMRI. NeuroImage 105(15):493–506.Crossref, Google Scholar
- (1958) On general minimax theorems. Pacific J. Math. 8(1):171–176.Crossref, Google Scholar
- (2011) Networks of the Brain (MIT Press, Cambridge).Google Scholar
- (2010) Multiple kernel learning and the SMO algorithm. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 23 (Curran Associates, Red Hook, NY), 2361–2369.Google Scholar
- (2010) Learning sparse svm for feature selection on very high dimensional datasets. Fürnkranz J, Joachims T, eds. Proc. 27th Internat. Conf. Machine Learn. (ICML-10) (Omni Press, Madison, WI), 1047–1054.Google Scholar
- (2014) Feature selection for classification: A review. Aggaarwal CC, ed. Data Classification: Algorithms and Applications, vol. 37 (CRC Press, Boca Raton, FL).Google Scholar
- (1996) Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. Ser. B: Methodology 58(1):267–288.Google Scholar
- (2005) Sparsity and smoothness via the fused lasso. J. Royal Statist. Soc.: Ser. B: Statist. Methodology 67(1):91–108.Crossref, Google Scholar
- (2008) A note on adaptive group lasso. Comput. Statist. Data Anal. 52(12):5277–5286.Crossref, Google Scholar
- (2007) On ℓ1-norm multiclass support vector machines. J. Amer. Statist. Assoc. 102(478):583–594.Crossref, Google Scholar
- (2010) Alterations of resting state functional connectivity in the default network in adolescents with autism spectrum disorders. Brain Res. 1313:202–214.Crossref, Google Scholar
- (2003) Use of the zero-norm with linear models and kernel methods. J. Machine Learn. Res. 3:1439–1461.Google Scholar
- (2016) Optimization and machine learning frameworks for complex network analysis. PhD thesis, University of Washington, Seattle.Google Scholar
- (2010) Group variable selection methods and their applications in analysis of genomic data. Feng J, Fu W, Sun F, eds. Frontiers Computational and Systems Biology (Springer, New York), 231–248.Crossref, Google Scholar
- (2009) Non-monotonic feature selection. Leon Bottou L, Littman M, eds. Proc. 26th Annual Internat. Conf. Machine Learn. (Omni Press, Madison, WI), 1145–1152.Crossref, Google Scholar
- (2010) Identifying main effects and epistatic interactions from large-scale snp data via adaptive group lasso. BMC Bioinform. 11(1):S18.Crossref, Google Scholar
- . (2011) The organization of the human cerebral cortex estimated by intrinsic functional connectivity. J. Neurophysiol. 106(3):1125–1165.Crossref, Google Scholar
- (2010) A comparison of optimization methods and software for large-scale l1-regularized linear classification. J. Machine Learn. Res. 11(1–4):3183–3234.Google Scholar
- (2011) Efficient methods for overlapping group lasso. Shawe-Taylor J, Zemel RS, Bartlett PL, Pereira F, Weinberger KQ, eds. Advances in Neural Information Processing Systems, vol. 24 (Curran Associates, Red Hook, NY), 352–360.Google Scholar
- (2006) Model selection and estimation in regression with grouped variables. J. Royal Statist. Soc.: Ser. B: Statist. Methodology 68(1):49–67.Crossref, Google Scholar
- (2012) Identifying major depression using whole-brain functional connectivity: A multivariate pattern analysis. Brain 135(5):1498–1507.Crossref, Google Scholar
- (2008) The sparsity and bias of the lasso selection in high-dimensional linear regression. Ann. Statist. 36(4)1567–1594.Crossref, Google Scholar
- (2010) A new exchange method for convex semi-infinite programming. SIAM J. Optim. 20(6):2959–2977.Crossref, Google Scholar
- (2006) The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101(476):1418–1429.Crossref, Google Scholar

