Beyond A Single AI Cluster: A Survey of Decentralized LLM Training
March 14, 2025 ยท The Cartographer ยท ๐ Conference on Empirical Methods in Natural Language Processing
"No code URL or promise found in abstract"
"Title-pattern auto-detect: Beyond A Single AI Cluster: A Survey of Decentralized LLM Training"
Evidence collected by the PWNC Scanner
Authors
Haotian Dong, Jingyan Jiang, Rongwei Lu, Jiajun Luo, Jiajun Song, Bowen Li, Ying Shen, Zhi Wang
arXiv ID
2503.11023
Category
cs.DC: Distributed Computing
Citations
8
Venue
Conference on Empirical Methods in Natural Language Processing
Last Checked
23 hours ago
Abstract
The emergence of large language models (LLMs) has revolutionized AI development, yet the resource demands beyond a single cluster or even datacenter, limiting accessibility to well-resourced organizations. Decentralized training has emerged as a promising paradigm to leverage dispersed resources across clusters, datacenters and regions, offering the potential to democratize LLM development for broader communities. As the first comprehensive exploration of this emerging field, we present decentralized LLM training as a resource-driven paradigm and categorize existing efforts into community-driven and organizational approaches. We further clarify this through: (1) a comparison with related paradigms, (2) a characterization of decentralized resources, and (3) a taxonomy of recent advancements. We also provide up-to-date case studies and outline future directions to advance research in decentralized LLM training.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Distributed Computing
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Reproducing GW150914: the first observation of gravitational waves from a binary black hole merger
R.I.P.
๐ป
Ghosted
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
R.I.P.
๐ป
Ghosted
Efficient Architecture-Aware Acceleration of BWA-MEM for Multicore Systems
R.I.P.
๐ป
Ghosted
Adaptive Federated Learning in Resource Constrained Edge Computing Systems
R.I.P.
๐ป
Ghosted