Multimedia-Aware Question Answering: A Review of Retrieval and Cross-Modal Reasoning Architectures
October 23, 2025 · The Cartographer · Proceedings of the 2nd ACM Workshop in AI-powered Question & Answering Systems
"No code URL or promise found in abstract"
"Title-pattern auto-detect: Multimedia-Aware Question Answering: A Review of Retrieval and Cross-Modal Reasoning Architectures"
Evidence collected by the PWNC Scanner
Authors
Rahul Raja, Arpita Vats
arXiv ID
2510.20193
Category
cs.IR: Information Retrieval
Cross-listed
cs.CL, cs.CV, cs.LG
Citations
1
Venue
Proceedings of the 2nd ACM Workshop in AI-powered Question & Answering Systems
Abstract
Question Answering (QA) systems have traditionally relied on structured text data, but the rapid growth of multimedia content (images, audio, video, and structured metadata) has introduced new challenges and opportunities for retrieval-augmented QA. In this survey, we review recent advancements in QA systems that integrate multimedia retrieval pipelines, focusing on architectures that align vision, language, and audio modalities with user queries. We categorize approaches based on retrieval methods, fusion techniques, and answer generation strategies, and analyze benchmark datasets, evaluation protocols, and performance tradeoffs. Furthermore, we highlight key challenges such as cross-modal alignment, latency-accuracy tradeoffs, and semantic grounding, and outline open problems and future research directions for building more robust and context-aware QA systems leveraging multimedia data.
Similar Papers
In the same crypt: Information Retrieval
R.I.P.
Ghosted
Old Age
Neural Graph Collaborative Filtering
R.I.P.
Ghosted
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
R.I.P.
Ghosted
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
R.I.P.
404 Not Found
Graph Neural Networks for Social Recommendation
R.I.P.
Ghosted