COMET: An Interactive Framework for Efficient and Effective Community Search via Active Learning

Published Online: https://doi.org/10.1287/ijoc.2024.0834

In recent years, substantial advancements in query-dependent community search (CS) have been driven by growing demands in downstream applications such as social network analysis, fraud detection, and bioinformatics. These applications require methods that identify structurally cohesive communities dependent on specific queries. Learning-based interactive CS (ICS) models the search as a multiround process with human interaction, enhancing its practicality. Nonetheless, learning-based approaches for ICS face two challenges. First, current methods for narrowing the search space rely on either query information or fixed topological structures, resulting in insufficient robustness when querying communities on large-scale graphs. Second, ICS lacks an effective interaction strategy, in which the algorithm should offer users choices of highly uncertain nodes to iteratively refine search quality. To address these issues, we propose COMET, an interactive community search framework designed for large-scale graphs. COMET consists of three key modules. First, it features a community-aware subgraph module tailored to each specific query based on Personalized PageRank (PPR), considering both query information and topological structure. Second, we conceptualize ICS as a series of binary classification tasks, employing a graph neural network (GNN) to propagate label information within the candidate subgraph in each round. Finally, a novel active learning–based node selection module uses prediction entropy from the GNN and PPR scores from the subgraph module to dynamically select the most crucial nodes for labeling in each round. Extensive experimental evaluations demonstrate that COMET significantly outperforms state-of-the-art learning-based CS and ICS methods across eight real-world data sets.
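To make the interaction loop described above more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It computes PPR by plain power iteration, keeps the top-scoring nodes as a candidate subgraph, and ranks unlabeled candidates by a mix of binary prediction entropy and normalized PPR. The function names, the restart probability `alpha`, the linear mixing rule with `weight`, and the hard-coded `probs` (standing in for GNN output) are all assumptions made for illustration; the abstract does not specify how COMET combines entropy and PPR.

```python
import math

def personalized_pagerank(adj, query, alpha=0.15, iters=100):
    """PPR by power iteration; alpha is an assumed restart probability."""
    rank = {v: 1.0 if v == query else 0.0 for v in adj}
    for _ in range(iters):
        nxt = {v: (alpha if v == query else 0.0) for v in adj}
        for u, nbrs in adj.items():
            if not nbrs:
                # Dangling node: send its mass back to the query node.
                nxt[query] += (1 - alpha) * rank[u]
                continue
            share = (1 - alpha) * rank[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank

def candidate_subgraph(ppr, size):
    """Keep the `size` nodes with the highest PPR score."""
    return set(sorted(ppr, key=ppr.get, reverse=True)[:size])

def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_nodes(probs, ppr, labeled, budget, weight=0.5):
    """Rank unlabeled nodes by a hypothetical entropy/PPR mix."""
    top = max(ppr[v] for v in probs) or 1.0  # normalize PPR to [0, 1]
    def score(v):
        return weight * binary_entropy(probs[v]) + (1 - weight) * ppr[v] / top
    pool = [v for v in probs if v not in labeled]
    return sorted(pool, key=score, reverse=True)[:budget]

# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
ppr = personalized_pagerank(adj, query=0)
candidates = candidate_subgraph(ppr, size=4)

# Made-up membership probabilities standing in for one round of GNN output.
probs = {1: 0.90, 2: 0.55, 3: 0.50}
picked = select_nodes(probs, ppr, labeled={0}, budget=2)
print(sorted(candidates), picked)
```

In a full loop, the user would label the nodes in `picked`, the classifier would be updated with those labels, and selection would repeat in the next round. Setting `weight=1.0` reduces the rule to pure uncertainty sampling, which ignores how close a node is to the query.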

History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.

Funding: K. Wang was supported by the National Natural Science Foundation of China [Grants 72221001 and 62302294].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0834) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2024.0834). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
