Accelerating Deep Neural Network guided MCTS using Adaptive Parallelism
October 09, 2023 · Declared Dead · SC Workshops
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yuan Meng, Qian Wang, Tianxin Zu, Viktor Prasanna
arXiv ID
2310.05313
Category
cs.PF: Performance
Cross-listed
cs.DC
Citations
1
Venue
SC Workshops
Last Checked
2 months ago
Abstract
Deep Neural Network guided Monte-Carlo Tree Search (DNN-MCTS) is a powerful class of AI algorithms. In DNN-MCTS, a Deep Neural Network model is trained collaboratively with a dynamic Monte-Carlo search tree to guide the agent towards actions that yield the highest returns. While the DNN operations are highly parallelizable, the search tree operations involved in MCTS are sequential and often become the system bottleneck. Existing MCTS parallel schemes on shared-memory multi-core CPU platforms either exploit data parallelism but sacrifice memory access latency, or take advantage of local cache for low-latency memory accesses but constrain the tree search to a single thread. In this work, we analyze the tradeoff between these parallel schemes and develop performance models for both based on the application and hardware parameters. We propose a novel implementation that addresses the tradeoff by adaptively choosing the optimal parallel scheme for the MCTS component on the CPU. Furthermore, we propose an efficient method for searching for the optimal communication batch size as the MCTS component on the CPU interfaces with DNN operations offloaded to an accelerator (GPU). Using a representative DNN-MCTS algorithm, AlphaZero, on board game benchmarks, we show that the parallel framework is able to adaptively generate the best-performing parallel implementation, leading to a $1.5\times$ to $3\times$ speedup compared with the baseline methods on CPU and CPU-GPU platforms.
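The idea of choosing a parallel scheme from a performance model can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not give their performance models: the `Workload` fields, the two toy cost formulas, and the exhaustive batch-size sweep are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical inputs to the toy cost models (not the paper's parameters)."""
    workers: int           # number of parallel workers, one rollout each per round
    t_tree_shared: float   # in-tree time per rollout on a shared tree (seconds)
    t_tree_local: float    # in-tree time per rollout on a cache-resident tree (seconds)
    t_dnn: float           # DNN inference time for one round of rollouts (seconds)


def shared_tree_round(w: Workload) -> float:
    # Assumed model: workers traverse the shared tree concurrently, so in-tree
    # time is paid once per round, but each access is slower.
    return w.t_tree_shared + w.t_dnn


def local_tree_round(w: Workload) -> float:
    # Assumed model: a single thread owns the tree, so accesses are fast but
    # the in-tree work for every rollout in the round is serialized.
    return w.workers * w.t_tree_local + w.t_dnn


def choose_scheme(w: Workload) -> str:
    """Return the scheme with the lower modeled time per round."""
    costs = {"shared_tree": shared_tree_round(w), "local_tree": local_tree_round(w)}
    return min(costs, key=costs.get)


def best_batch_size(candidates, round_time):
    """Exhaustive sweep (a stand-in, not the paper's search method).

    `round_time(b)` is a caller-supplied, hypothetical function returning the
    time to process one batch of b rollouts; the batch size with the highest
    modeled throughput is returned.
    """
    return max(candidates, key=lambda b: b / round_time(b))


if __name__ == "__main__":
    few = Workload(workers=4, t_tree_shared=10e-6, t_tree_local=1e-6, t_dnn=100e-6)
    many = Workload(workers=64, t_tree_shared=10e-6, t_tree_local=1e-6, t_dnn=100e-6)
    print(choose_scheme(few), choose_scheme(many))
```

Under these made-up numbers, the single-threaded local tree wins with 4 workers because the serialized in-tree work is still small, while the shared tree wins with 64 workers. The crossover point depends entirely on the assumed per-rollout costs.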
Similar Papers
In the same crypt: Performance
R.I.P.
👻
Ghosted
R.I.P.
👻
Ghosted
A General Formula for the Stationary Distribution of the Age of Information and Its Application to Single-Server Queues
R.I.P.
👻
Ghosted
AI Benchmark: All About Deep Learning on Smartphones in 2019
R.I.P.
👻
Ghosted
BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning
R.I.P.
👻
Ghosted
Online normalizer calculation for softmax
R.I.P.
👻
Ghosted
CLTune: A Generic Auto-Tuner for OpenCL Kernels
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Language Models are Few-Shot Learners
R.I.P.
👻
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
👻
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
👻
Ghosted