Value-Decomposition Networks For Cooperative Multi-Agent Learning

June 16, 2017 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

arXiv ID: 1706.05296

Category: cs.AI (Artificial Intelligence)

Citations: 1.2K

Venue: arXiv.org

Last Checked: 2 months ago

Abstract

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the "lazy agent" problem, which arises due to partial observability. We address these problems by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions. We perform an experimental evaluation across a range of partially-observable multi-agent domains and show that learning such value-decompositions leads to superior results, in particular when combined with weight sharing, role information and information channels.
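The decomposition described in the abstract can be illustrated with a small sketch. It assumes the additive form, in which the team value is the sum of agent-wise values, and it uses random numbers in place of learned networks. The function names (`joint_q`, `greedy_decentralised`, `greedy_centralised`) are hypothetical and do not come from the authors, who released no code. The point it shows is that, under an additive decomposition, each agent acting greedily on its own values recovers the greedy joint action.

```python
import itertools
import numpy as np

def joint_q(per_agent_q, joint_action):
    """Additive team value: sum of each agent's Q-value for its own action."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))

def greedy_decentralised(per_agent_q):
    """Each agent independently picks the argmax of its own Q-values."""
    return tuple(int(np.argmax(q)) for q in per_agent_q)

def greedy_centralised(per_agent_q):
    """Brute-force argmax over every joint action, for comparison only."""
    action_ranges = [range(len(q)) for q in per_agent_q]
    return max(
        itertools.product(*action_ranges),
        key=lambda joint_action: joint_q(per_agent_q, joint_action),
    )

# Assumption: 3 agents with 4 actions each; random values stand in for
# the outputs of learned per-agent networks at one time step.
rng = np.random.default_rng(0)
per_agent_q = [rng.normal(size=4) for _ in range(3)]

decentralised = greedy_decentralised(per_agent_q)
centralised = greedy_centralised(per_agent_q)
```

The brute-force search grows exponentially with the number of agents, while the decentralised choice costs one argmax per agent, which is one practical reason an additive decomposition is attractive when the combined action space is large.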


📜 Similar Papers

In the same crypt — Artificial Intelligence

R.I.P. 👻 Ghosted

A Unified Approach to Interpreting Model Predictions

Scott Lundberg, Su-In Lee

cs.AI 🏛 NeurIPS 📚 30.8K cites 8 years ago

R.I.P. 👻 Ghosted

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Alejandro Barredo Arrieta, Natalia Díaz-Rodríguez, ... (+10 more)

cs.AI 🏛 Inf. Fusion 📚 7.8K cites 6 years ago

R.I.P. 👻 Ghosted

Addressing Function Approximation Error in Actor-Critic Methods

Scott Fujimoto, Herke van Hoof, David Meger

cs.AI 🏛 ICML 📚 6.4K cites 8 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago

R.I.P. 👻 Ghosted

Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge

Peter Clark, Isaac Cowhey, ... (+5 more)

cs.AI 🏛 arXiv 📚 4.0K cites 8 years ago

R.I.P. 👻 Ghosted

Complex Embeddings for Simple Link Prediction

Théo Trouillon, Johannes Welbl, ... (+3 more)

cs.AI 🏛 ICML 📚 3.4K cites 9 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago