Human–AI Coevolution

Entry

Ask Don't Tell: Reducing Sycophancy in Large Language Models

Magda Dubois, Cozmin Ududec, Christopher Summerfield, Lennart Luettgau

Synopsis

Nested factorial study of how the framing of user input (epistemic certainty, I- vs. user-perspective, affirmation vs. negation) elicits sycophancy. Finds that sycophancy is substantially higher for non-questions than for questions, rises monotonically with stated certainty, and is amplified by I-perspective framing. Converting non-questions into questions before answering reduces sycophancy more reliably than a plain anti-sycophancy prompt.
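
The mitigation named in the synopsis (turn a non-question into a question, then answer it) can be sketched as a two-step pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the rewrite instruction, the `is_question` heuristic, and the `complete` callable (any function that maps a prompt string to a model reply) are all hypothetical names introduced here.

```python
from typing import Callable

# Hypothetical rewrite instruction; the paper's actual prompt wording is not
# reproduced in this entry, so this text is an assumption.
REWRITE_INSTRUCTION = (
    "Rewrite the following message as a single neutral question. "
    "Drop any statement of the author's own belief or certainty. "
    "Return only the question.\n\nMessage: "
)


def is_question(text: str) -> bool:
    """Crude heuristic (an assumption): text ending in '?' counts as a question."""
    return text.strip().endswith("?")


def ask_dont_tell(user_input: str, complete: Callable[[str], str]) -> str:
    """Answer user_input, converting non-questions into questions first.

    `complete` stands in for a model call: prompt string in, reply string out.
    """
    if is_question(user_input):
        # Already a question: pass it through unchanged.
        question = user_input.strip()
    else:
        # Step 1: ask the model to restate the input as a neutral question.
        question = complete(REWRITE_INSTRUCTION + user_input.strip()).strip()
    # Step 2: answer the question form rather than the original assertion.
    return complete(question)
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client; a real deployment would likely need a more robust question detector than a trailing question mark.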

Keywords

sycophancy · epistemic certainty · prompt framing · user-affirming responses · input-level mitigation


Related entries