R.I.P.
👻
Ghosted
Integrated Hardware Architecture and Device Placement Search
July 18, 2024 · Entered Twilight · International Conference on Machine Learning
Repo contents: .gitignore, Estimator, GraphExtractor, LICENSE, README.md, Solver, arguments.py, exec_modes.py, phaze.py, scripts, setup.sh, third_party_for_phaze, vocabfiles
Authors
Irene Wang, Jakub Tarnawski, Amar Phanishayee, Divya Mahajan
arXiv ID
2407.13143
Category
cs.LG: Machine Learning
Cross-listed
cs.AR, cs.DC
Citations
3
Venue
International Conference on Machine Learning
Repository
https://github.com/msr-fiddle/phaze
⭐ 7
Last Checked
2 months ago
Abstract
Distributed execution of deep learning training involves a dynamic interplay between hardware accelerator architecture and device placement strategy. This is the first work to explore the co-optimization of determining the optimal architecture and device placement strategy through novel algorithms, improving the balance of computational resources, memory usage, and data distribution. Our architecture search leverages tensor and vector units, determining their quantity and dimensionality, and on-chip and off-chip memory configurations. It also determines the microbatch size and decides whether to recompute or stash activations, balancing the memory footprint of training and storage size. For each explored architecture configuration, we use an Integer Linear Program (ILP) to find the optimal schedule for executing operators on the accelerator. The ILP results then integrate with a dynamic programming solution to identify the most effective device placement strategy, combining data, pipeline, and tensor model parallelism across multiple accelerators. Our approach achieves higher throughput on large language models compared to the state-of-the-art TPUv4 and the Spotlight accelerator search framework. The entire source code of PHAZE is available at https://github.com/msr-fiddle/phaze.
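The abstract mentions a dynamic programming step for device placement. As a rough illustration of that family of techniques only, the sketch below partitions a chain of layers into contiguous pipeline stages so that the slowest stage is as fast as possible. It is not PHAZE's algorithm: it ignores data and tensor parallelism, memory limits, and communication cost, and the name `partition_pipeline` and the per-layer times are hypothetical.

```python
from functools import lru_cache
from typing import List, Tuple


def partition_pipeline(layer_times: List[float], num_stages: int) -> Tuple[float, List[int]]:
    """Split a chain of layers into contiguous stages, minimizing the slowest stage.

    Returns (bottleneck_time, stage_ends), where each stage end is an exclusive index.
    Hypothetical helper for illustration; not taken from the PHAZE repository.
    """
    n = len(layer_times)
    if not 1 <= num_stages <= n:
        raise ValueError("num_stages must be between 1 and the number of layers")

    # prefix[i] holds the total time of layers[0:i].
    prefix = [0.0]
    for t in layer_times:
        prefix.append(prefix[-1] + t)

    @lru_cache(maxsize=None)
    def best(i: int, k: int) -> Tuple[float, Tuple[int, ...]]:
        # Best way to split layers[0:i] into exactly k non-empty stages.
        if k == 1:
            return prefix[i], (i,)
        result: Tuple[float, Tuple[int, ...]] = (float("inf"), ())
        for j in range(k - 1, i):  # the last stage is layers[j:i]
            head_cost, head_cuts = best(j, k - 1)
            cost = max(head_cost, prefix[i] - prefix[j])
            if cost < result[0]:
                result = (cost, head_cuts + (i,))
        return result

    cost, cuts = best(n, num_stages)
    return cost, list(cuts)


if __name__ == "__main__":
    # Made-up per-layer execution times, split across three accelerators.
    print(partition_pipeline([4, 2, 3, 1, 5], 3))
```

In a co-optimization setting, per-layer times like these would come from a per-architecture cost estimate, so a partition of this kind would be recomputed for every candidate hardware configuration.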
Similar Papers
In the same crypt: Machine Learning
XGBoost: A Scalable Tree Boosting System
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Semi-Supervised Classification with Graph Convolutional Networks
Proximal Policy Optimization Algorithms