Conformity in Large Language Models

October 16, 2024 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos arXiv ID 2410.12428 Category cs.CL: Computation & Language Cross-listed cs.AI Citations 13 Venue Annual Meeting of the Association for Computational Linguistics Last Checked 4 months ago

Abstract

The conformity effect describes the tendency of individuals to align their responses with the majority. Studying this bias in large language models (LLMs) is crucial, as LLMs are increasingly used in various information-seeking and decision-making tasks as conversation partners to improve productivity. Thus, conformity to incorrect responses can compromise their effectiveness. In this paper, we adapt psychological experiments to examine the extent of conformity in popular LLMs. Our findings reveal that all tested models exhibit varying levels of conformity toward the majority, regardless of their initial choice or correctness, across different knowledge domains. Notably, we are the first to show that LLMs are more likely to conform when they are more uncertain in their own prediction. We further explore factors that influence conformity, such as training paradigms and input characteristics, finding that instruction-tuned models are less susceptible to conformity, while increasing the naturalness of majority tones amplifies conformity. Finally, we propose two interventions, Devil's Advocate and Question Distillation, to mitigate conformity, providing insights into building more robust language models.