A Systematic Literature Review of Retrieval-Augmented Generation: Techniques, Metrics, and Challenges
August 08, 2025 · Declared Dead · Big Data and Cognitive Computing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Andrew Brown, Muhammad Roman, Barry Devereux
arXiv ID
2508.06401
Category
cs.DL: Digital Libraries
Cross-listed
cs.AI, cs.CL, cs.IR
Citations
8
Venue
Big Data and Cognitive Computing
Last Checked
2 months ago
Abstract
This systematic review of the research literature on retrieval-augmented generation (RAG) provides a focused analysis of the most highly cited studies published between 2020 and May 2025. A total of 128 articles met our inclusion criteria. The records were retrieved from ACM Digital Library, IEEE Xplore, Scopus, ScienceDirect, and the Digital Bibliography and Library Project (DBLP). RAG couples a neural retriever with a generative language model, grounding output in up-to-date, non-parametric memory while retaining the semantic generalisation stored in model weights. Guided by the PRISMA 2020 framework, we (i) specify explicit inclusion and exclusion criteria based on citation count and research questions, (ii) catalogue datasets, architectures, and evaluation practices, and (iii) synthesise empirical evidence on the effectiveness and limitations of RAG. To mitigate citation-lag bias, we applied a lower citation-count threshold to papers published in 2025 so that emerging breakthroughs with naturally fewer citations were still captured. This review clarifies the current research landscape, highlights methodological gaps, and charts priority directions for future research.
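The citation-based inclusion rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual thresholds, so the values below (`DEFAULT_MIN_CITATIONS`, `RECENT_MIN_CITATIONS`) and the `Record` type are hypothetical placeholders, and the publication window is checked at year granularity rather than to the month. This is an illustration of the idea, not the authors' screening procedure.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the abstract does not give the real values.
DEFAULT_MIN_CITATIONS = 30
RECENT_MIN_CITATIONS = 5
RECENT_YEAR = 2025


@dataclass(frozen=True)
class Record:
    title: str
    year: int
    citations: int


def min_citations_for(year: int) -> int:
    """Return the citation threshold that applies to a publication year."""
    # Papers from the most recent year get a lower bar to offset citation lag.
    return RECENT_MIN_CITATIONS if year == RECENT_YEAR else DEFAULT_MIN_CITATIONS


def meets_citation_criterion(record: Record) -> bool:
    """Check the 2020 to 2025 window (year granularity) and the threshold."""
    in_window = 2020 <= record.year <= 2025
    return in_window and record.citations >= min_citations_for(record.year)


def screen(records: list[Record]) -> list[Record]:
    """Keep only the records that satisfy the citation criterion."""
    return [r for r in records if meets_citation_criterion(r)]


# Made-up records, used only to exercise the rule.
candidates = [
    Record("older, well cited", 2021, 50),
    Record("older, rarely cited", 2021, 10),
    Record("recent, modestly cited", 2025, 10),
    Record("recent, barely cited", 2025, 2),
    Record("outside the window", 2019, 500),
]
kept = [r.title for r in screen(candidates)]
```

With these placeholder thresholds, the 2025 record with 10 citations is kept while the 2021 record with the same count is excluded, which is the effect the lower threshold for recent papers is meant to have.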
Similar Papers
In the same crypt: Digital Libraries
Measuring academic influence: Not all citations are equal (Ghosted)
The Open Access Advantage Considering Citation, Article Usage and Social Media Attention (Ghosted)
A Bibliometric Review of Large Language Models Research from 2017 to 2023 (Ghosted)
On the Performance of Hybrid Search Strategies for Systematic Literature Reviews in Software Engineering (Ghosted)
A Systematic Identification and Analysis of Scientists on Twitter (Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (Ghosted)
XGBoost: A Scalable Tree Boosting System (Ghosted)