Multi-agent Reinforcement Learning in Sequential Social Dilemmas

February 10, 2017 · Declared Dead · 🏛 Adaptive Agents and Multi-Agent Systems

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel
arXiv ID: 1702.03037
Category: cs.MA (Multiagent Systems)
Cross-listed: cs.AI, cs.GT, cs.LG
Citations: 667
Venue: Adaptive Agents and Multi-Agent Systems
Last Checked: 1 month ago

Abstract

Matrix games like Prisoner's Dilemma have guided research on social dilemmas for decades. However, they necessarily treat the choice to cooperate or defect as an atomic action. In real-world social dilemmas these choices are temporally extended. Cooperativeness is a property that applies to policies, not elementary actions. We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions. We analyze the dynamics of policies learned by multiple self-interested independent learning agents, each using its own deep Q-network, on two Markov games we introduce here: (1) a fruit Gathering game and (2) a Wolfpack hunting game. We characterize how learned behavior in each domain changes as a function of environmental factors including resource abundance. Our experiments show how conflict can emerge from competition over shared resources and shed light on how the sequential nature of real-world social dilemmas affects cooperation.
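
The "mixed incentive structure of matrix game social dilemmas" mentioned in the abstract can be made concrete with a small check. The sketch below is illustrative only and is not the authors' code. It assumes the commonly used formulation in which a two-player symmetric matrix game has four payoffs (R for mutual cooperation, P for mutual defection, S for cooperating against a defector, T for defecting against a cooperator) and counts as a social dilemma when R > P, R > S, 2R > T + S, and at least one of T > R ("greed") or P > S ("fear") holds. The names `Payoffs` and `is_social_dilemma` and all payoff values are hypothetical choices for this example.

```python
from typing import NamedTuple


class Payoffs(NamedTuple):
    """Payoffs of a symmetric two-player matrix game (illustrative naming)."""
    R: float  # reward: both players cooperate
    P: float  # punishment: both players defect
    S: float  # sucker: cooperate while the other defects
    T: float  # temptation: defect while the other cooperates


def is_social_dilemma(p: Payoffs) -> bool:
    """Return True if the payoffs satisfy the assumed social dilemma inequalities."""
    prefers_mutual_cooperation = p.R > p.P      # mutual cooperation beats mutual defection
    prefers_cooperation_to_sucker = p.R > p.S   # mutual cooperation beats being exploited
    beats_alternation = 2 * p.R > p.T + p.S     # mutual cooperation beats taking turns exploiting
    greed = p.T > p.R                           # exploiting a cooperator is tempting
    fear = p.P > p.S                            # mutual defection is safer than being exploited
    return (
        prefers_mutual_cooperation
        and prefers_cooperation_to_sucker
        and beats_alternation
        and (greed or fear)
    )


# Illustrative payoff values, chosen only to exercise each case.
prisoners_dilemma = Payoffs(R=3, P=1, S=0, T=5)  # both greed and fear
stag_hunt = Payoffs(R=4, P=1, S=0, T=3)          # fear only
chicken = Payoffs(R=3, P=0, S=1, T=4)            # greed only
harmony = Payoffs(R=4, P=1, S=2, T=3)            # neither greed nor fear

for name, game in [
    ("Prisoner's Dilemma", prisoners_dilemma),
    ("Stag Hunt", stag_hunt),
    ("Chicken", chicken),
    ("Harmony", harmony),
]:
    print(f"{name}: social dilemma = {is_social_dilemma(game)}")
```

In this framing, cooperation and defection are single atomic actions with fixed payoffs. The sequential setting described in the abstract differs in that cooperativeness has to be read off from whole learned policies instead.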

📜 Similar Papers

In the same crypt — Multiagent Systems

R.I.P. 👻 Ghosted

Mean Field Multi-Agent Reinforcement Learning

Yaodong Yang, Rui Luo, ... (+4 more)

cs.MA 🏛 ICML 📚 660 cites 8 years ago

R.I.P. 👻 Ghosted

A Survey and Critique of Multiagent Deep Reinforcement Learning

Pablo Hernandez-Leal, Bilal Kartal, Matthew E. Taylor

cs.MA 🏛 Autonomous Agents and Multi-Agent Systems 📚 657 cites 7 years ago

R.I.P. 👻 Ghosted

A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity

Pablo Hernandez-Leal, Michael Kaisers, ... (+2 more)

cs.MA 🏛 arXiv 📚 310 cites 8 years ago

R.I.P. 👻 Ghosted

Collaborative vehicle routing: a survey

Margaretha Gansterer, Richard F. Hartl

cs.MA 🏛 European Journal of Operational Research 📚 284 cites 8 years ago

R.I.P. 👻 Ghosted

Deep Reinforcement Learning for Swarm Systems

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

cs.MA 🏛 JMLR 📚 229 cites 7 years ago

R.I.P. 👻 Ghosted

A Survey of Deep Reinforcement Learning in Video Games

Kun Shao, Zhentao Tang, ... (+3 more)

cs.MA 🏛 arXiv 📚 226 cites 6 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago