Analyzing Network Formation Using Big Data Analytics: A Study of Degree Distribution, Clustering Coefficient, Diameter, and Assortativity

Published Online: https://doi.org/10.1287/mnsc.2023.00475

This research presents an approach to network formation that goes beyond conventional models built on connection rules for individual agents. We show that agents can produce diverse network structures through uniform edge searches, that is, random connection attempts made without preference. The model offers a new viewpoint on network formation by emphasizing edges rather than nodes. Within this framework, edges carry more weight than in other network formation models because they represent tangible interactions, such as question-and-answer exchanges, commercial transactions, and coauthorships, rather than abstract relationships. We prove that the model converges to a stationary equilibrium degree distribution, implying that in the long term the probability of observing certain network characteristics stabilizes and becomes time-invariant. Furthermore, we derive closed-form asymptotic expressions for key network statistics and validate them through both simulations and real-world data sets. We find that three core dimensions are inherently in competition with one another: (1) quantity (mean degree), reflecting agents' overall connectivity; (2) quality (clustering coefficient), capturing local cohesion; and (3) equity (skewness of the in-degree distribution), indicating how strongly connections concentrate on a few nodes. Increasing the number of searches boosts quantity but lowers quality by diluting clustering, whereas increasing the secondary connection probability enhances clustering yet decreases equity by concentrating in-degree and intensifying the Matthew effect. These trade-offs reveal essential limits of network formation while providing a flexible framework that connects theoretical insight to practical decisions in network design and management.
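
As a rough illustration of how the two parameters named in the abstract might interact, the following Python sketch simulates a toy growth process and reports the three statistics discussed above. It is not the paper's model. The mechanism is an assumption made for this example: each arriving node samples existing edges uniformly at random, links to the sampled edge's head, and with probability `p_secondary` also links to its tail. The names `simulate`, `summarize`, `searches`, and `p_secondary` are likewise assumptions.

```python
import random
from statistics import mean, pstdev


def simulate(n_nodes, searches, p_secondary, seed=0):
    """Grow a directed graph with a toy edge-search rule (assumed, not from the paper)."""
    rng = random.Random(seed)
    edges = [(1, 0)]  # directed (source, target) pairs; one seed edge between nodes 1 and 0
    for new in range(2, n_nodes):
        targets = set()
        for _ in range(searches):
            u, v = rng.choice(edges)          # uniform draw over existing edges
            targets.add(v)                    # primary link to the sampled edge's head
            if rng.random() < p_secondary:
                targets.add(u)                # secondary link to the sampled edge's tail
        # Add the new node's links only after all of its searches are done.
        edges.extend((new, t) for t in sorted(targets))
    return edges


def summarize(edges, n_nodes):
    """Return (mean degree, average local clustering, in-degree skewness)."""
    in_deg = [0] * n_nodes
    nbrs = [set() for _ in range(n_nodes)]    # undirected neighbourhoods
    for u, v in edges:
        in_deg[v] += 1
        nbrs[u].add(v)
        nbrs[v].add(u)

    mean_degree = 2 * len(edges) / n_nodes    # each edge contributes to two endpoints

    local = []
    for i in range(n_nodes):
        ns = list(nbrs[i])
        k = len(ns)
        if k < 2:
            continue                          # clustering is undefined below two neighbours
        links = sum(
            1
            for a in range(k)
            for b in range(a + 1, k)
            if ns[b] in nbrs[ns[a]]
        )
        local.append(2 * links / (k * (k - 1)))
    clustering = mean(local) if local else 0.0

    mu, sd = mean(in_deg), pstdev(in_deg)
    skewness = mean(((d - mu) / sd) ** 3 for d in in_deg) if sd > 0 else 0.0
    return mean_degree, clustering, skewness


if __name__ == "__main__":
    # Compare a few hypothetical parameter settings on a small graph.
    for searches, p in [(1, 0.2), (3, 0.2), (3, 0.8)]:
        stats = summarize(simulate(500, searches, p), 500)
        print(searches, p, [round(x, 3) for x in stats])
```

Running the script prints the three statistics for a few parameter settings, which makes it possible to check informally whether this toy rule shows trade-offs of the kind the abstract describes. Any agreement or disagreement with the paper's closed-form results would need to be verified against the paper itself.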

This paper was accepted by J. George Shanthikumar, data science.

Funding: The research of J. Qi is funded by the National Natural Science Foundation of China [Grant 72422005] and the Hong Kong Research Grant Council [Grants GRF 16209923, 16213424, and TRS T32-615/24-R]. Y.-J. Chen acknowledges financial support from the Hong Kong Research Grant Council [Grants C6020-21GF, 16212821, and 16204521].

Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2023.00475.
