Minimax Strikes Back
December 19, 2020 Β· Declared Dead Β· π Adaptive Agents and Multi-Agent Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Quentin Cohen-Solal, Tristan Cazenave
arXiv ID
2012.10700
Category
cs.AI: Artificial Intelligence
Citations
17
Venue
Adaptive Agents and Multi-Agent Systems
Last Checked
4 months ago
Abstract
Deep Reinforcement Learning reaches a superhuman level of play in many complete information games. The state of the art algorithm for learning with zero knowledge is AlphaZero. We take another approach, AthΓ©nan, which uses a different, Minimax-based, search algorithm called Descent, as well as different learning targets and that does not use a policy. We show that for multiple games it is much more efficient than the reimplementation of AlphaZero: Polygames. It is even competitive with Polygames when Polygames uses 100 times more GPU (at least for some games). One of the keys to the superior performance is that the cost of generating state data for training is approximately 296 times lower with AthΓ©nan. With the same reasonable ressources, AthΓ©nan without reinforcement heuristic is at least 7 times faster than Polygames and much more than 30 times faster with reinforcement heuristic.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted