Learning in two-player games between transparent opponents
December 04, 2020 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Adrian Hutter
arXiv ID
2012.02671
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.GT,
cs.LG
Citations
6
Venue
arXiv.org
Last Checked
4 months ago
Abstract
We consider a scenario in which two reinforcement learning agents repeatedly play a matrix game against each other and update their parameters after each round. The agents' decision-making is transparent to each other, which allows each agent to predict how their opponent will play against them. To prevent an infinite regress of both agents recursively predicting each other indefinitely, each agent is required to give an opponent-independent response with some probability at least epsilon. Transparency also allows each agent to anticipate and shape the other agent's gradient step, i.e. to move to regions of parameter space in which the opponent's gradient points in a direction favourable to them. We study the resulting dynamics experimentally, using two algorithms from previous literature (LOLA and SOS) for opponent-aware learning. We find that the combination of mutually transparent decision-making and opponent-aware learning robustly leads to mutual cooperation in a single-shot prisoner's dilemma. In a game of chicken, in which both agents try to manoeuvre their opponent towards their preferred equilibrium, converging to a mutually beneficial outcome turns out to be much harder, and opponent-aware learning can even lead to worst-case outcomes for both agents. This highlights the need to develop opponent-aware learning algorithms that achieve acceptable outcomes in social dilemmas involving an equilibrium selection problem.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted