R.I.P.
Ghosted
Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria
June 09, 2026 · Grace Period · IJCAI 2026
Authors
Wongyu Lee, Francesco Lelli, Omran Ayoub, Massimo Tornatore
arXiv ID
2606.11284
Category
cs.MA: Multiagent Systems
Cross-listed
cs.GT,
cs.LG
Citations
0
Venue
IJCAI 2026
Abstract
Real-world multi-agent systems, from traffic coordination to resource allocation, are often modeled as general-sum games where individual incentives conflict with collective welfare. In these settings, the central challenge is not merely finding an equilibrium, but selecting socially desirable outcomes among many suboptimal Nash equilibria. Standard deep multi-agent reinforcement learning (MARL) methods struggle with this problem, as value-decomposition approaches are constrained by monotonicity assumptions and policy-gradient methods often converge to stable but socially inefficient equilibria. To address this limitation, we propose $\Phi$-Actor-Critic ($\Phi$-AC), a framework that leverages swap regret minimization to steer learning toward high-welfare correlated equilibria (CE). To make counterfactual regret estimation tractable in deep MARL, $\Phi$-AC employs a centralized attention critic that predicts vector-valued regrets in a single forward pass, avoiding computationally expensive counterfactual simulations. We further introduce a Lagrangian-based equilibrium selection mechanism that optimizes social welfare while enforcing stability through regret constraints. Experiments on matrix games, Multi-Agent Particle Environments (MPE), and the Melting Pot Harvest scenario demonstrate that $\Phi$-AC learns efficient and stable coordination strategies across diverse mixed-motive settings while maintaining high collective return and competitive fairness.
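As a rough illustration of the swap-regret condition the abstract refers to, the sketch below checks whether a joint action distribution in a two-player matrix game is a correlated equilibrium and reports its social welfare. It is not the paper's method: the game (the textbook Chicken payoffs), the function names `swap_regret` and `welfare`, and the exact tabular computation are all assumptions chosen for illustration, whereas the abstract describes a learned critic that estimates regrets.

```python
import numpy as np

def swap_regret(mu, u_row, u_col):
    """Swap regret of each player under joint distribution mu[r, c].

    A distribution with zero swap regret for both players is a
    correlated equilibrium. (Hypothetical helper, for illustration.)
    """
    # gain_row[a, b]: mass-weighted payoff to the row player when it is
    # recommended a but plays b, i.e. sum_c mu[a, c] * u_row[b, c].
    gain_row = mu @ u_row.T
    row = np.sum(gain_row.max(axis=1) - np.diag(gain_row))
    # gain_col[a, b]: same quantity for the column player,
    # i.e. sum_r mu[r, a] * u_col[r, b].
    gain_col = mu.T @ u_col
    col = np.sum(gain_col.max(axis=1) - np.diag(gain_col))
    return row, col

def welfare(mu, u_row, u_col):
    """Expected sum of both players' payoffs under mu."""
    return float(np.sum(mu * (u_row + u_col)))

# Chicken: action 0 = Dare, action 1 = Chicken (assumed example game).
u_row = np.array([[0.0, 7.0], [2.0, 6.0]])
u_col = np.array([[0.0, 2.0], [7.0, 6.0]])

# Pure Nash equilibrium (Dare, Chicken).
nash = np.array([[0.0, 1.0], [0.0, 0.0]])
# "Traffic light" correlated equilibrium: uniform over the three
# outcomes other than (Dare, Dare).
ce = np.array([[0.0, 1 / 3], [1 / 3, 1 / 3]])
# Welfare-maximizing but unstable outcome (Chicken, Chicken).
coop = np.array([[0.0, 0.0], [0.0, 1.0]])

for name, mu in [("nash", nash), ("ce", ce), ("coop", coop)]:
    print(name, swap_regret(mu, u_row, u_col), welfare(mu, u_row, u_col))
```

In this toy game the correlated equilibrium has zero swap regret and higher welfare than the pure Nash equilibrium, while the fully cooperative outcome has the highest welfare but positive regret for both players. That tension between welfare and regret is what a welfare objective with regret constraints is meant to navigate.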
Similar Papers
In the same crypt: Multiagent Systems
R.I.P.
Ghosted
Mean Field Multi-Agent Reinforcement Learning
The Cartographer
A Survey and Critique of Multiagent Deep Reinforcement Learning
The Cartographer
A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
The Cartographer
Collaborative vehicle routing: a survey
R.I.P.
Ghosted