Ask Don't Tell: Reducing Sycophancy in Large Language Models
Magda Dubois, Cozmin Ududec, Christopher Summerfield, Lennart Luettgau
Nested factorial study of how the framing of user input (epistemic certainty, I-vs-user perspective, affirmation vs negation) provokes sycophancy. Finds that sycophancy is substantially higher for non-questions than for questions, rises monotonically with stated certainty, and is amplified by the I-perspective. Converting non-questions into questions before answering reduces sycophancy more reliably than a plain anti-sycophancy prompt.
·sycophancy ·epistemic certainty ·prompt framing ·user-affirming responses ·input-level mitigation
- Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models · April 13, 2026 · arXiv
- A Rational Analysis of the Effects of Sycophantic AI · February 15, 2026 · arXiv
- Belief Offloading in Human-AI Interaction · February 9, 2026 · arXiv
- How RLHF Amplifies Sycophancy · February 1, 2026 · arXiv
- Towards Understanding Sycophancy in Language Models · October 20, 2023 · ICLR 2024