R.I.P.
๐ป
Ghosted
CoopEval: Benchmarking Cooperation-Sustaining Mechanisms and LLM Agents in Social Dilemmas
April 16, 2026 ยท Grace Period ยท + Add venue
Authors
Emanuel Tewolde, Xiao Zhang, David Guzman Piedrahita, Vincent Conitzer, Zhijing Jin
arXiv ID
2604.15267
Category
cs.GT: Game Theory
Cross-listed
cs.AI,
cs.CL,
cs.CY,
cs.MA
Citations
0
Abstract
It is increasingly important that LLM agents interact effectively and safely with other goal-pursuing agents, yet, recent works report the opposite trend: LLMs with stronger reasoning capabilities behave _less_ cooperatively in mixed-motive games such as the prisoner's dilemma and public goods settings. Indeed, our experiments show that recent models -- with or without reasoning enabled -- consistently defect in single-shot social dilemmas. To tackle this safety concern, we present the first comparative study of game-theoretic mechanisms that are designed to enable cooperative outcomes between rational agents _in equilibrium_. Across four social dilemmas testing distinct components of robust cooperation, we evaluate the following mechanisms: (1) repeating the game for many rounds, (2) reputation systems, (3) third-party mediators to delegate decision making to, and (4) contract agreements for outcome-conditional payments between players. Among our findings, we establish that contracting and mediation are most effective in achieving cooperative outcomes between capable LLM models, and that repetition-induced cooperation deteriorates drastically when co-players vary. Moreover, we demonstrate that these cooperation mechanisms become _more effective_ under evolutionary pressures to maximize individual payoffs.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Game Theory
R.I.P.
๐ป
Ghosted
A Motivational Game-Theoretic Approach for Peer-to-Peer Energy Trading in the Smart Grid
R.I.P.
๐ป
Ghosted
Computing Resource Allocation in Three-Tier IoT Fog Networks: a Joint Optimization Approach Combining Stackelberg Game and Matching
R.I.P.
๐ป
Ghosted
Fast Convergence of Regularized Learning in Games
R.I.P.
๐ป
Ghosted
Computation Peer Offloading for Energy-Constrained Mobile Edge Computing in Small-Cell Networks
R.I.P.
๐ป
Ghosted