Human–AI Coevolution

Entry

Secret Collusion among AI Agents: Multi-Agent Deception via Steganography

Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H.S. Torr, Lewis Hammond, Christian Schroeder de Witt

Synopsis

Formalises steganographic secret collusion among generative AI agents and evaluates current models. Current capabilities are limited, but GPT-4 shows a capability jump. The paper proposes monitoring and mitigation measures, including paraphrasing.

Keywords

secret collusion · steganography · multi-agent deception · governance · mitigation


Related entries