Graph-Based Feature Selection Method Under Budget Constraint for Multiclass Classification Problems

Published Online:https://doi.org/10.1287/ijds.2024.0050

This paper introduces a novel graph-based method for budget-constrained feature selection (GB-BC-FS) in multiclass classification problems. The method identifies a subset of features that complement each other’s ability to distinguish between different classes, thereby utilizing the entire feature space while maintaining the model’s predictive performance and adhering to budget constraints on feature costs. This is achieved through an intuitive heuristic based on a scoring function, allowing users to calibrate the solution provided by GB-BC-FS. The calibration prioritizes selecting features with complementary qualities while minimizing the costs associated with feature collection, under constraint compliance. The approach is designed to handle practical limitations, making it suitable for applications where resources like cost and time are constrained. This not only improves computational efficiency but also aligns with broader implications related to optimizing resource utilization and ensuring practical applicability in data-driven industries. The effectiveness of GB-BC-FS was validated through extensive experimental analysis, including two comprehensive experiments with a real case study. These experiments demonstrated that GB-BC-FS significantly outperforms existing state-of-the-art approaches, achieving an average accuracy improvement of 10.4% and saving an average of 85.17% in run time compared with finding the optimal set of features, all while adhering to budget limits. Our code is fully documented and available online at https://github.com/davidlevinwork/gbfs/.

Funding: This work was supported by the Israeli Ministry of Innovation, Science and Technology [Grant 0004323].

Data Ethics & Reproducibility Note: The code capsule is available at https://github.com/davidlevinwork/gbfs/ and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2024.0050).

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.