Entry
CooperBench: Why Coding Agents Cannot be Your Teammates Yet
Arpandeep Khatua, Hao Zhu, Peter Tran, Arya Prabhudesai, Frederic Sadrieh, Johann K. Lieberwirth, Xinkai Yu, Yicheng Fu, Michael J. Ryan, Jiaxin Pei, Diyi Yang
600+ collaborative coding tasks across 12 libraries / 4 languages; agents achieve 30% lower success rates when working together vs. solo.
·coding agents ·cooperation ·benchmark ·multi-agent ·communication
- SWE-chat: Coding Agent Interactions From Real Users in the Wild · April 22, 2026 · arXiv
- AI Organizations are More Effective but Less Aligned than Individual Agents · April 11, 2026 · arXiv
- Institutional AI: Governing LLM Collusion in Multi-Agent Cournot Markets via Public Governance Graphs · January 16, 2026 · arXiv
- A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents · December 23, 2025 · arXiv
- Governance-as-a-Service: A Multi-Agent Framework for AI System Compliance and Policy Enforcement · August 26, 2025 · arXiv
- Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents · April 25, 2024 · NeurIPS 2024