A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models

July 05, 2025 · The Cartographer · 🏛 Annual Meeting of the Association for Computational Linguistics

Authors: Shuliang Liu, Hongyi Liu, Aiwei Liu, Bingchen Duan, Qi Zheng, Yibo Yan, He Geng, Peijie Jiang, Jia Liu, Xuming Hu
arXiv ID: 2507.05288
Category: cs.IR (Information Retrieval)
Cross-listed: cs.AI, cs.CL
Citations: 2
Venue: Annual Meeting of the Association for Computational Linguistics

Abstract

The widespread deployment of large language models (LLMs) across critical domains has amplified the societal risks posed by algorithmically generated misinformation. Unlike traditional false content, LLM-generated misinformation can be self-reinforcing, highly plausible, and capable of rapid propagation across multiple languages, which traditional detection methods fail to mitigate effectively. This paper introduces a proactive defense paradigm, shifting from passive post hoc detection to anticipatory mitigation strategies. We propose a Three Pillars framework: (1) Knowledge Credibility, fortifying the integrity of training and deployed data; (2) Inference Reliability, embedding self-corrective mechanisms during reasoning; and (3) Input Robustness, enhancing the resilience of model interfaces against adversarial attacks. Through a comprehensive survey of existing techniques and a comparative meta-analysis, we demonstrate that proactive defense strategies offer up to 63% improvement over conventional methods in misinformation prevention, despite non-trivial computational overhead and generalization challenges. We argue that future research should focus on co-designing robust knowledge foundations, reasoning certification, and attack-resistant interfaces to ensure LLMs can effectively counter misinformation across varied domains.