Privacy Protection and Statistical Efficiency Trade-Off for Federated Learning

Published Online: https://doi.org/10.1287/ijoc.2024.0554

Federated learning is a novel framework for distributed learning that aims to break down isolated data islands while protecting data privacy. To further guard against privacy leakage from specially crafted attacks, differential privacy is often integrated. Although differential privacy effectively secures sensitive information, it can reduce the statistical efficiency of the resulting estimators, which creates a trade-off between statistical efficiency and privacy protection. To understand this trade-off theoretically, we start with the classic linear regression model and a noise-adding federated gradient descent algorithm, and rigorously study its numerical convergence and asymptotic properties. The analysis yields insights into how statistical efficiency and privacy protection trade off against each other. Guided by these insights, we further develop a Polyak-Ruppert-type averaged estimator, which can achieve good statistical efficiency with guaranteed privacy protection. Extensive simulation studies corroborate our theoretical results. Finally, we illustrate the proposed method on an enterprise community data set.
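
To make the setting concrete, the following is a minimal sketch of what a noise-adding federated gradient descent for linear regression, followed by Polyak-Ruppert-type averaging of the iterates, might look like. It is not the paper's algorithm: the abstract does not give the update rule, so the Gaussian noise, the constant step size, the burn-in period, and all names (noisy_federated_gd, simulate, noise_sd, burn_in) are assumptions made for illustration. In particular, the noise level here is not calibrated to any formal differential privacy guarantee.

```python
import numpy as np


def noisy_federated_gd(clients, step_size, noise_sd, n_iters, burn_in, rng):
    """Illustrative noise-adding federated gradient descent for least squares.

    clients: list of (X, y) pairs, one per client (hypothetical layout).
    Returns the last iterate and the average of the iterates after burn_in.
    """
    p = clients[0][0].shape[1]
    theta = np.zeros(p)
    running_sum = np.zeros(p)
    for t in range(n_iters):
        noisy_grads = []
        for X, y in clients:
            # Local gradient of the least-squares loss (1 / 2n) * ||X theta - y||^2.
            grad = X.T @ (X @ theta - y) / len(y)
            # Assumed privacy mechanism: Gaussian noise added before sharing.
            noisy_grads.append(grad + rng.normal(0.0, noise_sd, size=p))
        # The server averages the noisy gradients and takes one descent step.
        theta = theta - step_size * np.mean(noisy_grads, axis=0)
        if t >= burn_in:
            running_sum += theta
    # Polyak-Ruppert-type estimator: average of the post-burn-in iterates.
    theta_bar = running_sum / (n_iters - burn_in)
    return theta, theta_bar


def simulate(seed=0, noise_sd=1.0):
    """Run the sketch on synthetic data split evenly across 10 clients."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([1.0, -2.0, 0.5])
    clients = []
    for _ in range(10):
        X = rng.normal(size=(200, 3))
        y = X @ theta_true + rng.normal(scale=0.5, size=200)
        clients.append((X, y))
    theta_last, theta_bar = noisy_federated_gd(
        clients, step_size=0.1, noise_sd=noise_sd,
        n_iters=2000, burn_in=500, rng=rng,
    )
    return theta_true, theta_last, theta_bar


if __name__ == "__main__":
    theta_true, theta_last, theta_bar = simulate()
    print("last-iterate error:", np.linalg.norm(theta_last - theta_true))
    print("averaged error:    ", np.linalg.norm(theta_bar - theta_true))
```

With a constant step size, the injected noise keeps the last iterate fluctuating around the least-squares solution, whereas averaging many iterates smooths out part of that fluctuation. This is the intuition the sketch is meant to convey; the paper's precise conditions and guarantees are in the article itself.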

History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.

Funding: Financial support from the National Natural Science Foundation of China [Grants 12401386, 72371241, 72495123, and 12271012], the Ministry of Education Project of Key Research Institute of Humanities and Social Sciences [Grant 22JJD910001], the Postdoctoral Fellowship Program of China Postdoctoral Science Foundation [Grant GZB20230070], and the Beijing Municipal Social Science Foundation [Grant 24GLC033] is gratefully acknowledged.

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information (https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2024.0554) as well as from the IJOC GitHub software repository (https://github.com/INFORMSJoC/2024.0554). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/.
