R.I.P.
๐ป
Ghosted
A Survey of Long-Document Retrieval in the PLM and LLM Era
September 09, 2025 ยท The Cartographer ยท ๐ arXiv.org
"No code URL or promise found in abstract"
"Title-pattern auto-detect: A Survey of Long-Document Retrieval in the PLM and LLM Era"
Evidence collected by the PWNC Scanner
Authors
Minghan Li, Miyang Luo, Tianrui Lv, Yishuai Zhang, Siqi Zhao, Ercong Nie, Guodong Zhou
arXiv ID
2509.07759
Category
cs.IR: Information Retrieval
Citations
3
Venue
arXiv.org
Last Checked
4 days ago
Abstract
The proliferation of long-form documents presents a fundamental challenge to information retrieval (IR), as their length, dispersed evidence, and complex structures demand specialized methods beyond standard passage-level techniques. This survey provides the first comprehensive treatment of long-document retrieval (LDR), consolidating methods, challenges, and applications across three major eras. We systematize the evolution from classical lexical and early neural models to modern pre-trained (PLM) and large language models (LLMs), covering key paradigms like passage aggregation, hierarchical encoding, efficient attention, and the latest LLM-driven re-ranking and retrieval techniques. Beyond the models, we review domain-specific applications, specialized evaluation resources, and outline critical open challenges such as efficiency trade-offs, multimodal alignment, and faithfulness. This survey aims to provide both a consolidated reference and a forward-looking agenda for advancing long-document retrieval in the era of foundation models.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Information Retrieval
๐
๐
Old Age
Neural Graph Collaborative Filtering
R.I.P.
๐ป
Ghosted
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
R.I.P.
๐ป
Ghosted
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
R.I.P.
๐
404 Not Found
Graph Neural Networks for Social Recommendation
R.I.P.
๐ป
Ghosted