Graph-Based Feature Selection Method Under Budget Constraint for Multiclass Classification Problems
Published Online: 5 Jun 2025
https://doi.org/10.1287/ijds.2024.0050

