SAGDA: Achieving $\mathcal{O}(ε^{-2})$ Communication Complexity in Federated Min-Max Learning

October 02, 2022 · Declared Dead · 🏛 NeurIPS 2022

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Haibo Yang, Zhuqing Liu, Xin Zhang, Jia Liu
arXiv ID: 2210.00611
Category: cs.LG (Machine Learning)
Cross-listed: cs.AI
Citations: 0
Venue: NeurIPS 2022
Last Checked: 4 months ago

Abstract

To lower the communication complexity of federated min-max learning, a natural approach is to communicate infrequently (by running multiple local updates between communication rounds), as in conventional federated learning. However, due to the more complicated inner-outer problem structure in federated min-max learning, theoretical understanding of its communication complexity under infrequent communication remains very limited in the literature. This is particularly true for settings with non-i.i.d. datasets and partial client participation. To address this challenge, we propose a new algorithmic framework called stochastic sampling averaging gradient descent ascent (SAGDA), which i) assembles stochastic gradient estimators from randomly sampled clients as control variates and ii) uses separate learning rates on the server and client sides. We show that SAGDA achieves a linear speedup in terms of both the number of clients and the number of local update steps, which yields an $\mathcal{O}(ε^{-2})$ communication complexity that is orders of magnitude lower than the state of the art. Interestingly, by noting that standard federated stochastic gradient descent ascent (FSGDA) is in fact a control-variate-free special case of SAGDA, we immediately arrive at an $\mathcal{O}(ε^{-2})$ communication complexity result for FSGDA. Therefore, through the lens of SAGDA, we also advance the current understanding of the communication complexity of the standard FSGDA method for federated min-max learning.
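
The ingredients named in the abstract (local updates, partial client participation, separate server and client learning rates, and a control variate built from sampled clients) can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy objective, the function names `client_grads` and `federated_gda`, all default parameter values, and the specific SCAFFOLD/SVRG-style correction are assumptions chosen only for illustration, and the gradients here are exact rather than stochastic. Setting `use_control_variate=False` gives a plain federated gradient descent ascent loop with local steps.

```python
import numpy as np


def client_grads(x, y, p_i, q_i):
    """Gradients of a toy local objective (an assumption, not from the paper):
    f_i(x, y) = 0.5*(x - p_i)**2 + x*y - 0.5*(y - q_i)**2,
    minimized over x and maximized over y."""
    gx = (x - p_i) + y
    gy = x - (y - q_i)
    return gx, gy


def federated_gda(p, q, rounds=100, local_steps=5, sample_size=None,
                  eta_client=0.1, eta_server=1.0,
                  use_control_variate=True, seed=0):
    """Federated gradient descent ascent with local steps (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = len(p)
    m = n if sample_size is None else sample_size
    x, y = 0.0, 0.0
    for _ in range(rounds):
        # Partial participation: sample m of the n clients without replacement.
        sampled = rng.choice(n, size=m, replace=False)
        # Gradients of the sampled clients at the round's starting point;
        # their average serves as the (assumed) control variate.
        g0 = [client_grads(x, y, p[i], q[i]) for i in sampled]
        gx_bar = np.mean([g[0] for g in g0])
        gy_bar = np.mean([g[1] for g in g0])
        dx_sum, dy_sum = 0.0, 0.0
        for (gx0, gy0), i in zip(g0, sampled):
            xi, yi = x, y
            for _ in range(local_steps):
                gx, gy = client_grads(xi, yi, p[i], q[i])
                if use_control_variate:
                    # Replace the client's own starting gradient with the
                    # sampled average to reduce drift from non-i.i.d. data.
                    gx = gx - gx0 + gx_bar
                    gy = gy - gy0 + gy_bar
                # Client learning rate: descent in x, ascent in y.
                xi, yi = xi - eta_client * gx, yi + eta_client * gy
            dx_sum += xi - x
            dy_sum += yi - y
        # Server learning rate scales the averaged client displacement.
        x += eta_server * dx_sum / m
        y += eta_server * dy_sum / m
    return x, y
```

For this toy objective the global saddle point is $x^* = (\bar{p} - \bar{q})/2$, $y^* = (\bar{p} + \bar{q})/2$, where $\bar{p}$ and $\bar{q}$ are the client averages. Calling `federated_gda(p, q)` with full participation should approach that point, while passing `sample_size` smaller than the number of clients leaves the iterates fluctuating around it because each round sees only a subset of the heterogeneous clients.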