Towards Taming the Resource and Data Heterogeneity in Federated Learning


Zheng Chai, George Mason University; Hannan Fayyaz, York University; Zeshan Fayyaz, Ryerson University; Ali Anwar, Yi Zhou, Nathalie Baracaldo, and Heiko Ludwig, IBM Research–Almaden; Yue Cheng, George Mason University


Machine learning model training often requires data from multiple parties. However, in some cases, data owners cannot or will not share their data due to legal or privacy constraints, yet would still like to benefit from training a model jointly with other parties. To this end, federated learning (FL) has emerged as an alternative way to perform collaborative model training without sharing the training data. Such collaboration yields models that are more accurate than any model a single party, owning only a partial share of the data, could hope to learn in isolation.
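To make the FL workflow concrete, the following is a minimal sketch of federated averaging (FedAvg), the canonical FL training loop, in plain NumPy. It is an illustration of the general technique, not the system studied in this paper; the function names and hyperparameters are our own. Each client runs gradient descent on its private data, and the server only ever sees model weights, never raw data.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # Client-side step: fit a linear model on private data via gradient descent.
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(client_data, rounds=20, dim=3):
    # Server-side loop: broadcast the global model, collect client updates,
    # and average them weighted by each client's local dataset size.
    w = np.zeros(dim)
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in client_data]
        sizes = [len(y) for _, y in client_data]
        w = np.average(updates, axis=0, weights=sizes)
    return w

# Synthetic example: three clients with heterogeneous dataset sizes,
# all drawn from the same underlying linear model.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for n in (50, 120, 200):
    X = rng.normal(size=(n, 3))
    clients.append((X, X @ true_w))

w = fed_avg(clients)  # converges close to true_w
```

Note that only the weight vectors cross the client–server boundary; the per-client `(X, y)` arrays stay local, which is the core privacy property FL provides.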

In this paper, we study the impact of resource heterogeneity (e.g., CPU, memory, and network resources) and data heterogeneity (e.g., training dataset sizes) on the training time of FL. We then discuss the research problems and challenges involved in taming such resource and data heterogeneity in FL systems.
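One reason heterogeneity hurts training time is that synchronous FL waits for the slowest participant in every round. The toy calculation below, a sketch of our own and not taken from the paper, models per-round time as the maximum over clients of (local dataset size / processing speed), showing how a single straggler dominates:

```python
import random

def round_time(clients):
    # Synchronous FL: the server aggregates only after all selected
    # clients finish, so round time is set by the slowest (straggler).
    return max(n_samples / speed for n_samples, speed in clients)

random.seed(1)
# (local dataset size, samples processed per second) per client
homogeneous = [(1000, 100.0)] * 10
heterogeneous = [(random.randint(200, 2000), random.uniform(20.0, 200.0))
                 for _ in range(10)]

print(round_time(homogeneous))    # 10.0 s: every client finishes together
print(round_time(heterogeneous))  # dominated by the straggler
```

Under this model, resource heterogeneity (varying speeds) and data heterogeneity (varying dataset sizes) compound: a slow client holding a large dataset stalls every round.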


@inproceedings {232971,
author = {Zheng Chai and Hannan Fayyaz and Zeshan Fayyaz and Ali Anwar and Yi Zhou and Nathalie Baracaldo and Heiko Ludwig and Yue Cheng},
title = {Towards Taming the Resource and Data Heterogeneity in Federated Learning},
booktitle = {2019 {USENIX} Conference on Operational Machine Learning (OpML 19)},
year = {2019},
isbn = {978-1-939133-00-7},
address = {Santa Clara, CA},
pages = {19--21},
url = {},
publisher = {{USENIX} Association},
}