Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches
October 06, 2025 · The Cartographer · arXiv.org
"No code URL or promise found in abstract"
"Title-pattern auto-detect: Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches"
Evidence collected by the PWNC Scanner
Authors
Yicheng Tao, Yao Qin, Yepang Liu
arXiv ID
2510.04905
Category
cs.SE: Software Engineering
Cross-listed
cs.CL
Citations
8
Venue
arXiv.org
Abstract
Recent advancements in large language models (LLMs) have substantially improved automated code generation. While function-level and file-level generation have achieved promising results, real-world software development typically requires reasoning across entire repositories. This gives rise to the challenging task of Repository-Level Code Generation (RLCG), where models must capture long-range dependencies, ensure global semantic consistency, and generate coherent code spanning multiple files or modules. To address these challenges, Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm that integrates external retrieval mechanisms with LLMs, enhancing context-awareness and scalability. In this survey, we provide a comprehensive review of research on Retrieval-Augmented Code Generation (RACG), with an emphasis on repository-level approaches. We categorize existing work along several dimensions, including generation strategies, retrieval modalities, model architectures, training paradigms, and evaluation protocols. Furthermore, we summarize widely used datasets and benchmarks, analyze current limitations, and outline key challenges and opportunities for future research. Our goal is to establish a unified analytical framework for understanding this rapidly evolving field and to inspire continued progress in AI-powered software engineering.
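To make the retrieve-then-generate idea in the abstract concrete, here is a minimal sketch, not the survey's method. It assumes the repository has already been split into snippets and uses a simple token-overlap (Jaccard) score as a stand-in for a real retriever. The names Chunk, tokenize, retrieve, and build_prompt are hypothetical, and the final LLM call is deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Chunk:
    path: str  # file the snippet came from
    text: str  # snippet source code


def tokenize(text: str) -> Set[str]:
    # Extract identifier-like tokens and lowercase them.
    return {t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    # Token-set overlap; 0.0 when either side is empty.
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query: str, chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    # Rank repository snippets by lexical similarity to the request
    # (an assumed toy retriever; real systems may use embeddings or graphs).
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: jaccard(q, tokenize(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, retrieved: List[Chunk]) -> str:
    # Prepend retrieved cross-file context to the generation request.
    context = "\n\n".join(f"# File: {c.path}\n{c.text}" for c in retrieved)
    return f"{context}\n\n# Task: {query}\n"


# Hypothetical three-file repository used only for illustration.
repo = [
    Chunk("config.py", "def load_config(path):\n    return parse_yaml(open(path).read())"),
    Chunk("db.py", "def connect(url):\n    return Database(url)"),
    Chunk("utils.py", "def parse_yaml(text):\n    return yaml.safe_load(text)"),
]
task = "call load_config with a path and return the result"
prompt = build_prompt(task, retrieve(task, repo, k=1))
print(prompt)  # this string would be passed to a code LLM of your choice
```

The point of the sketch is the division of labor: retrieval selects which parts of the repository the model sees, and generation is conditioned on that selected context rather than on the whole codebase.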
Similar Papers
In the same crypt: Software Engineering
Microservices: yesterday, today, and tomorrow
A Survey of Machine Learning for Big Code and Naturalness
An Overview on Smart Contracts: Challenges, Advances and Platforms
Slither: A Static Analysis Framework For Smart Contracts