Federated Optimization Under Intermittent Client Availability

Published Online: https://doi.org/10.1287/ijoc.2022.0057

References

  • Bonawitz KA, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, et al. (2019) Toward federated learning at scale: System design. Talwalkar A, Smith V, Zaharia M, eds. Proc. Machine Learn. Systems (mlsys.org).
  • Caldas S, Wu P, Li T, Konečný J, McMahan HB, Smith V, Talwalkar A (2018) LEAF: A benchmark for federated settings. Preprint, submitted December 3, https://arxiv.org/abs/1812.01097.
  • Cho YJ, Wang J, Joshi G (2020) Client selection in federated learning: Convergence analysis and power-of-choice selection strategies. Preprint, submitted October 3, https://arxiv.org/abs/2010.01243.
  • Cho YJ, Wang J, Joshi G (2022) Toward understanding biased client selection in federated learning. Camps-Valls G, Ruiz FJR, Valera I, eds. Proc. Internat. Conf. on Artificial Intelligence and Statist. (PMLR, New York), 10351–10375.
  • Duan Jh, Li W, Lu S (2020) Federated learning with decoupled probabilistic-weighted gradient aggregation. Accessed May 6, 2023, https://openreview.net/forum?id=Hw2Za4N5hy0.
  • Eichner H, Koren T, McMahan B, Srebro N, Talwar K (2019) Semi-cyclic stochastic gradient descent. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. on Machine Learning (PMLR, New York), 1764–1773.
  • Go A, Bhayani R, Huang L (2009) Twitter sentiment classification using distant supervision. Accessed February 20, 2022, https://www-cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf.
  • Gu X, Huang K, Zhang J, Huang L (2021) Fast federated learning in the presence of arbitrary device unavailability. Ranzato M, Beygelzimer A, Dauphin YN, Liang P, Vaughan JW, eds. Proc. Annual Conf. on Neural Inform. Processing Systems, 12052–12064.
  • Hard A, Rao K, Mathews R, Beaufays F, Augenstein S, Eichner H, Kiddon C, et al. (2018) Federated learning for mobile keyboard prediction. Preprint, submitted November 8, https://arxiv.org/abs/1811.03604.
  • Hsieh K, Phanishayee A, Mutlu O, Gibbons PB (2020) The non-IID data quagmire of decentralized machine learning. Daumé H III, Singh A, eds. Proc. 37th Internat. Conf. on Machine Learn. (PMLR, New York), 4387–4398.
  • Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz KA, et al. (2021) Advances and open problems in federated learning. Foundations Trends Machine Learn. 14(1–2):1–210.
  • Karimireddy SP, Kale S, Mohri M, Reddi SJ, Stich SU, Suresh AT (2020) SCAFFOLD: Stochastic controlled averaging for federated learning. Daumé H III, Singh A, eds. Proc. 37th Internat. Conf. on Machine Learning (PMLR, New York), 5132–5143.
  • Khaled A, Mishchenko K, Richtárik P (2019) First analysis of local GD on heterogeneous data. Preprint, submitted September 10, https://arxiv.org/abs/1909.04715.
  • LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc. IEEE 86(11):2278–2324.
  • Li M, Andersen DG, Park JW, Smola AJ, Ahmed A, Josifovski V, Long J, et al. (2014) Scaling distributed machine learning with the parameter server. Flinn J, Levy H, eds. Proc. 11th USENIX Sympos. on Operating Systems Design and Implementation (USENIX Association, Berkeley, CA), 583–598.
  • Li T, Sahu AK, Talwalkar A, Smith V (2020a) Federated learning: Challenges, methods, and future directions. IEEE Signal Processing Magazine 37(3):50–60.
  • Li T, Sanjabi M, Beirami A, Smith V (2020b) Fair resource allocation in federated learning. Proc. 8th Internat. Conf. on Learn. Representations (OpenReview.net).
  • Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V (2020c) Federated optimization in heterogeneous networks. Dhillon IS, Papailiopoulos DS, Sze V, eds. Proc. Machine Learn. Systems (mlsys.org).
  • Li X, Huang K, Yang W, Wang S, Zhang Z (2020d) On the convergence of FedAvg on non-IID data. Proc. 8th Internat. Conf. Learn. Representations (OpenReview.net).
  • McMahan B, Moore E, Ramage D, Hampson S, Arcas BA (2017) Communication-efficient learning of deep networks from decentralized data. Singh A, Zhu XJ, eds. Proc. 20th Internat. Conf. on Artificial Intelligence and Statist. (PMLR, New York), 1273–1282.
  • Mohri M, Sivek G, Suresh AT (2019) Agnostic federated learning. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learning (PMLR, New York), 4615–4625.
  • Moreno-Torres JG, Raeder T, Alaíz-Rodríguez R, Chawla NV, Herrera F (2012) A unifying view on data set shift in classification. Pattern Recognition 45(1):521–530.
  • Pennington J, Socher R, Manning CD (2014) GloVe: Global vectors for word representation. Moschitti A, Pang B, Daelemans W, eds. Proc. Conf. on Empirical Methods in Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 1532–1543.
  • Quiñonero-Candela J, Sugiyama MS, Schwaighofer A, Lawrence ND (2009) Data Set Shift in Machine Learning (MIT Press, Cambridge, MA).
  • Reddi SJ, Charles Z, Zaheer M, Garrett Z, Rush K, Konečný J, Kumar S, et al. (2021) Adaptive federated optimization. Proc. 9th Internat. Conf. Learn. Representations (OpenReview.net).
  • Ribero M, Vikalo H, de Veciana G (2022) Federated learning under intermittent client availability and time-varying communication constraints. Preprint, submitted May 13, https://arxiv.org/abs/2205.06730.
  • Ruan Y, Zhang X, Liang S, Joe-Wong C (2021) Toward flexible device participation in federated learning. Banerjee A, Fukumizu K, eds. Proc. 24th Internat. Conf. on Artificial Intelligence and Statist. (PMLR, New York), 3403–3411.
  • Sharma P, Panda R, Joshi G, Varshney PK (2022) Federated minimax optimization: Improved convergence analyses and algorithms. Chaudhuri K, Jegelka S, Song L, Szepesvári C, Niu G, Sabato S, eds. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 19683–19730.
  • Snoek J, Ovadia Y, Fertig E, Lakshminarayanan B, Nowozin S, Sculley D, Dillon JV, et al. (2019) Can you trust your model’s uncertainty? Evaluating predictive uncertainty under data set shift. Wallach HM, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox EB, Garnett R, eds. Proc. Annual Conf. Neural Inform. Processing Systems, 13969–13980.
  • Stich SU (2019) Local SGD converges fast and communicates little. Proc. 7th Internat. Conf. Learn. Representations (OpenReview.net).
  • Stich SU, Karimireddy SP (2019) The error-feedback framework: Better rates for SGD with delayed gradients and compressed communication. Preprint, submitted September 11, https://arxiv.org/abs/1909.05350.
  • Subbaswamy A, Schulam P, Saria S (2019) Preventing failures due to data set shift: Learning predictive models that transport. Chaudhuri K, Sugiyama M, eds. Proc. 22nd Internat. Conf. on Artificial Intelligence and Statist. (PMLR, New York), 3118–3127.
  • Wang H, Yurochkin M, Sun Y, Papailiopoulos DS, Khazaeni Y (2020) Federated learning with matched averaging. Proc. 8th Internat. Conf. Learn. Representations (OpenReview.net).
  • Wang J, Joshi G (2018) Cooperative SGD: A unified framework for the design and analysis of communication-efficient SGD algorithms. Preprint, submitted August 22, https://arxiv.org/abs/1808.07576.
  • Wang S, Ruan Y, Tu Y, Wagle S, Brinton CG, Joe-Wong C (2021) Network-aware optimization of distributed learning for fog computing. IEEE/ACM Trans. Networks 29(5):2019–2032.
  • Xia W, Quek TQS, Guo K, Wen W, Yang HH, Zhu H (2020) Multi-armed bandit-based client scheduling for federated learning. IEEE Trans. Wireless Comm. 19(11):7108–7123.
  • Yan Y (2023) Federated optimization under intermittent client availability. http://dx.doi.org/10.1287/ijoc.2022.0057.cd, https://github.com/INFORMSJoC/2022.0057.
  • Yan Y, Niu C, Ding Y, Zheng Z, Wu F, Chen G, Tang S, Wu Z (2020) Distributed non-convex optimization with sublinear speedup under intermittent client availability. Preprint, submitted February 18, https://arxiv.org/abs/2002.07399v1.
  • Yang T, Andrew G, Eichner H, Sun H, Li W, Kong N, Ramage D, Beaufays F (2018) Applied federated learning: Improving Google keyboard query suggestions. Preprint, submitted December 7, https://arxiv.org/abs/1812.02903.
  • Yu H, Yang S, Zhu S (2019) Parallel restarted SGD with faster convergence and less communication: Demystifying why model averaging works for deep learning. Hentenryck PV, Zhou Z, eds. Proc. 33rd AAAI Conf. on Artificial Intelligence; 31st Innovative Applications of Artificial Intelligence Conf.; and 9th AAAI Sympos. on Edu. Adv. in Artificial Intelligence (AAAI Press, Palo Alto, CA), 5693–5700.
  • Yurochkin M, Agarwal M, Ghosh S, Greenewald KH, Hoang TN, Khazaeni Y (2019) Bayesian nonparametric federated learning of neural networks. Proc. 36th Internat. Conf. on Machine Learning (PMLR, New York), 7252–7261.
  • Zhou Y, Tang S (2020) Differentially private distributed learning. INFORMS J. Comput. 32(3):779–789.