FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion

October 27, 2024 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Zhenheng Tang, Yonggang Zhang, Peijie Dong, Yiu-ming Cheung, Amelie Chi Zhou, Bo Han, Xiaowen Chu arXiv ID 2410.20380 Category cs.LG: Machine Learning Cross-listed cs.AI, cs.DC, cs.NI Citations 18 Venue Neural Information Processing Systems Last Checked 4 months ago

Abstract

One-shot Federated Learning (OFL) significantly reduces communication costs in FL by aggregating trained models only once. However, the performance of advanced OFL methods is far behind the normal FL. In this work, we provide a causal view to find that this performance drop of OFL methods comes from the isolation problem, which means that local isolatedly trained models in OFL may easily fit to spurious correlations due to the data heterogeneity. From the causal perspective, we observe that the spurious fitting can be alleviated by augmenting intermediate features from other clients. Built upon our observation, we propose a novel learning approach to endow OFL with superb performance and low communication and storage costs, termed as FuseFL. Specifically, FuseFL decomposes neural networks into several blocks, and progressively trains and fuses each block following a bottom-up manner for feature augmentation, introducing no additional communication costs. Comprehensive experiments demonstrate that FuseFL outperforms existing OFL and ensemble FL by a significant margin. We conduct comprehensive experiments to show that FuseFL supports high scalability of clients, heterogeneous model training, and low memory costs. Our work is the first attempt using causality to analyze and alleviate data heterogeneity of OFL.