Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

July 27, 2020 · Declared Dead · 🏛 Neural Information Processing Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong
arXiv ID: 2007.13544
Category: cs.GT (Game Theory)
Cross-listed: cs.AI, cs.LG
Citations: 162
Venue: Neural Information Processing Systems
Last Checked: 2 months ago

Abstract

The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.

📜 Similar Papers

In the same crypt — Game Theory

R.I.P. 👻 Ghosted

Cycles in adversarial regularized learning

Panayotis Mertikopoulos, Christos Papadimitriou, Georgios Piliouras

cs.GT 🏛 SODA 📚 349 cites 8 years ago

R.I.P. 👻 Ghosted

A Motivational Game-Theoretic Approach for Peer-to-Peer Energy Trading in the Smart Grid

Wayes Tushar, Tapan Kumar Saha, ... (+5 more)

cs.GT 🏛 Applied Energy 📚 331 cites 7 years ago

R.I.P. 👻 Ghosted

Computing Resource Allocation in Three-Tier IoT Fog Networks: a Joint Optimization Approach Combining Stackelberg Game and Matching

Huaqing Zhang, Yong Xiao, ... (+4 more)

cs.GT 🏛 IEEE IoTJ 📚 308 cites 9 years ago

R.I.P. 👻 Ghosted

Fast Convergence of Regularized Learning in Games

Vasilis Syrgkanis, Alekh Agarwal, ... (+2 more)

cs.GT 🏛 NeurIPS 📚 296 cites 10 years ago

R.I.P. 👻 Ghosted

Computation Peer Offloading for Energy-Constrained Mobile Edge Computing in Small-Cell Networks

Lixing Chen, Sheng Zhou, Jie Xu

cs.GT 🏛 IEEE/ACM ToN 📚 285 cites 9 years ago

R.I.P. 👻 Ghosted

Blockchain Mining Games

Aggelos Kiayias, Elias Koutsoupias, ... (+2 more)

cs.GT 🏛 EC 📚 273 cites 9 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago