Efficient Parallel Methods for Deep Reinforcement Learning
May 13, 2017 · Entered Twilight · arXiv.org
"Last commit was 8.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, LICENSE.txt, README.md, actor_learner.py, atari_emulator.py, atari_roms, emulator_runner.py, environment.py, environment_creator.py, logger_utils.py, networks.py, paac.py, policy_v_network.py, pretrained, readme_files, runners.py, test.py, train.py
Authors
Alfredo V. Clemente, Humberto N. Castejón, Arjun Chandra
arXiv ID
1705.04862
Category
cs.LG: Machine Learning
Citations
118
Venue
arXiv.org
Repository
https://github.com/alfredvc/paac
★ 201
Last Checked
2 months ago
Abstract
We propose a novel framework for efficient parallelization of deep reinforcement learning algorithms, enabling these algorithms to learn from multiple actors on a single machine. The framework is algorithm agnostic and can be applied to on-policy, off-policy, value based and policy gradient based algorithms. Given its inherent parallelism, the framework can be efficiently implemented on a GPU, allowing the usage of powerful models while significantly reducing training time. We demonstrate the effectiveness of our framework by implementing an advantage actor-critic algorithm on a GPU, using on-policy experiences and employing synchronous updates. Our algorithm achieves state-of-the-art performance on the Atari domain after only a few hours of training. Our framework thus opens the door for much faster experimentation on demanding problem domains. Our implementation is open-source and is made public at https://github.com/alfredvc/paac
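The abstract describes a synchronous advantage actor-critic update computed from several actors at once. The following is a minimal sketch of the return and advantage computation such an update might use. It assumes n-step bootstrapped returns over a fixed rollout length, and it is not taken from the linked repository. The function names, the `rewards[t][e]` layout (time step `t`, environment `e`), and the discount factor are all illustrative assumptions.

```python
from typing import List, Sequence


def n_step_returns(
    rewards: Sequence[Sequence[float]],
    dones: Sequence[Sequence[bool]],
    bootstrap_values: Sequence[float],
    gamma: float = 0.99,
) -> List[List[float]]:
    """Discounted returns for a batch of environments stepped in lockstep.

    rewards[t][e] and dones[t][e] describe step t of environment e.
    bootstrap_values[e] is the critic's estimate for the state reached
    after the last step of the rollout (hypothetical layout).
    """
    t_max = len(rewards)
    n_envs = len(bootstrap_values)
    returns = [[0.0] * n_envs for _ in range(t_max)]
    running = list(bootstrap_values)
    # Walk the rollout backwards so each return reuses the one after it.
    for t in reversed(range(t_max)):
        for e in range(n_envs):
            # A terminal transition cuts off bootstrapping from later steps.
            future = 0.0 if dones[t][e] else gamma * running[e]
            running[e] = rewards[t][e] + future
            returns[t][e] = running[e]
    return returns


def advantages(
    returns: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Advantage estimate: return minus the critic's value for the same state."""
    return [
        [r - v for r, v in zip(r_row, v_row)]
        for r_row, v_row in zip(returns, values)
    ]
```

In a synchronous setup of this kind, every environment contributes the same number of steps to each update, so the returns and advantages for all actors can be gathered into one batch and applied in a single gradient step. A GPU implementation would normally use array operations rather than the nested Python loops shown here.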
Similar Papers
In the same crypt: Machine Learning
XGBoost: A Scalable Tree Boosting System
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Semi-Supervised Classification with Graph Convolutional Networks
Proximal Policy Optimization Algorithms