KnowShiftQA: How Robust are RAG Systems when Textbook Knowledge Shifts in K-12 Education?

December 12, 2024 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Tianshi Zheng, Weihan Li, Jiaxin Bai, Weiqi Wang, Yangqiu Song arXiv ID 2412.08985 Category cs.CL: Computation & Language Citations 2 Venue Annual Meeting of the Association for Computational Linguistics Last Checked 4 months ago

Abstract

Retrieval-Augmented Generation (RAG) systems show remarkable potential as question answering tools in the K-12 Education domain, where knowledge is typically queried within the restricted scope of authoritative textbooks. However, discrepancies between these textbooks and the parametric knowledge inherent in Large Language Models (LLMs) can undermine the effectiveness of RAG systems. To systematically investigate RAG system robustness against such knowledge discrepancies, we introduce KnowShiftQA. This novel question answering dataset simulates these discrepancies by applying deliberate hypothetical knowledge updates to both answers and source documents, reflecting how textbook knowledge can shift. KnowShiftQA comprises 3,005 questions across five subjects, designed with a comprehensive question typology focusing on context utilization and knowledge integration. Our extensive experiments on retrieval and question answering performance reveal that most RAG systems suffer a substantial performance drop when faced with these knowledge discrepancies. Furthermore, questions requiring the integration of contextual (textbook) knowledge with parametric (LLM) knowledge pose a significant challenge to current LLMs.