Collective Constitutional AI: Aligning a Language Model with Public Input
Saffron Huang, Divya Siddarth, Liane Lovitt, Thomas I. Liao, Esin Durmus, Alex Tamkin, Deep Ganguli
An end-to-end pipeline that elicits constitutional principles from ~1,000 Americans via Polis and trains a CCAI model on the resulting public-input constitution. The model shows lower bias across nine social dimensions while maintaining equivalent task performance.
- AI Organizations are More Effective but Less Aligned than Individual Agents · April 11, 2026 · arXiv
- Chain of Alignment: Integrating Public Will with Expert Intelligence for Language Model Alignment · October 10, 2024 · NeurIPS 2024 Workshop on Pluralistic Alignment
- Position: Towards Bidirectional Human-AI Alignment · June 13, 2024 · ICML 2024
- Constitutional AI: Harmlessness from AI Feedback · December 15, 2022 · arXiv
- Training Language Models to Follow Instructions with Human Feedback · March 4, 2022 · NeurIPS 2022
- The Triadic Loop: A Framework for Negotiating Alignment in AI Co-hosted Livestreaming · April 20, 2026 · arXiv