Federated Learning Under Adversarial Silence Attacks

Lili Su
Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115

Ming Xiang (Corresponding Author)
https://orcid.org/0000-0001-5958-132X
Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115

Jiaming Xu
Fuqua School of Business, Duke University, Durham, North Carolina 27708

Pengkun Yang
Department of Statistics and Data Science, Tsinghua University, Beijing 100190, China

Published Online: 11 Jun 2026
https://doi.org/10.1287/ijoc.2024.1017

Su L, Xiang M, Xu J, Yang P (2026) Federated learning under adversarial silence attacks. https://doi.org/10.1287/ijoc.2024.1017.cd, https://github.com/INFORMSJoC/2024.1017

INFORMS Journal on Computing

Articles In Advance

Received: November 25, 2024
Accepted: May 01, 2026
Published Online: June 11, 2026

Cite as

Lili Su, Ming Xiang, Jiaming Xu, Pengkun Yang (2026) Federated Learning Under Adversarial Silence Attacks. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2024.1017

Acknowledgments

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the National Science Foundation, the Army Research Laboratory, or the U.S. Government.
