A Universal Framework for Offline Serendipity Evaluation in Recommender Systems via Large Language Models
August 25, 2025 · Declared Dead · International Conference on Information and Knowledge Management
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yu Tokutake, Kazushi Okamoto, Kei Harada, Atsushi Shibata, Koki Karube
arXiv ID
2508.17571
Category
cs.IR: Information Retrieval
Citations
0
Venue
International Conference on Information and Knowledge Management
Last Checked
4 months ago
Abstract
Serendipity in recommender systems (RSs) has attracted increasing attention as a concept that enhances user satisfaction by presenting unexpected and useful items. However, evaluating serendipitous performance remains challenging because its ground truth is generally unobservable. Existing offline metrics often depend on ambiguous definitions or are tailored to specific datasets and RSs, thereby limiting their generalizability. To address this issue, we propose a universally applicable evaluation framework that leverages large language models (LLMs), known for their extensive knowledge and reasoning capabilities, as evaluators. First, to improve the evaluation performance of the proposed framework, we assessed the serendipity prediction accuracy of LLMs using four different prompt strategies on a dataset containing user-annotated serendipitous ground truth and found that the chain-of-thought prompt achieved the highest accuracy. Next, we re-evaluated the serendipitous performance of both serendipity-oriented and general RSs using the proposed framework on three commonly used real-world datasets, without the ground truth. The results indicated that no serendipity-oriented RS consistently outperformed the others across all datasets, and that even a general RS sometimes achieved higher performance than the serendipity-oriented RSs.
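The abstract describes using an LLM with a chain-of-thought prompt to judge whether a recommended item is serendipitous. The following is a minimal sketch of that idea, not the authors' implementation: the prompt wording, the "Answer: yes/no" output convention, the function names (`build_cot_prompt`, `parse_verdict`, `serendipity_rate`), and the `stub_llm` stand-in are all assumptions made for illustration. The abstract does not specify the actual prompts or the aggregation metric.

```python
import re
from typing import Callable, Sequence


def build_cot_prompt(history: Sequence[str], candidate: str) -> str:
    """Assemble a chain-of-thought style prompt (hypothetical wording)."""
    lines = "\n".join(f"- {title}" for title in history)
    return (
        "A user has interacted with these items:\n"
        f"{lines}\n\n"
        f"Recommended item: {candidate}\n\n"
        "Think step by step about whether this item is both unexpected and "
        "useful for the user. Then finish with one line of the form "
        "'Answer: yes' or 'Answer: no'."
    )


# Assumed output convention: the model ends with "Answer: yes" or "Answer: no".
_ANSWER = re.compile(r"answer:\s*(yes|no)\b", re.IGNORECASE)


def parse_verdict(response: str) -> bool:
    """Return True if the last 'Answer:' marker says yes; default to False."""
    matches = _ANSWER.findall(response)
    return bool(matches) and matches[-1].lower() == "yes"


def serendipity_rate(
    history: Sequence[str],
    recommendations: Sequence[str],
    llm: Callable[[str], str],
) -> float:
    """Fraction of recommended items the LLM judges serendipitous."""
    if not recommendations:
        return 0.0
    hits = sum(
        parse_verdict(llm(build_cot_prompt(history, item)))
        for item in recommendations
    )
    return hits / len(recommendations)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call: flags documentaries as serendipitous."""
    candidate = prompt.split("Recommended item: ")[1].split("\n")[0]
    verdict = "yes" if "Documentary" in candidate else "no"
    return f"Reasoning: compared the item with the history.\nAnswer: {verdict}"


if __name__ == "__main__":
    rate = serendipity_rate(
        ["Action Movie A", "Action Movie B"],
        ["Action Movie C", "Nature Documentary D"],
        stub_llm,
    )
    print(rate)
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a real evaluator would replace `stub_llm` with a function that sends the prompt to a model and returns its text response.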
Similar Papers
In the same crypt: Information Retrieval
R.I.P.
Ghosted
Old Age
Neural Graph Collaborative Filtering
R.I.P.
Ghosted
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
R.I.P.
Ghosted
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
R.I.P.
404 Not Found
Graph Neural Networks for Social Recommendation
R.I.P.
Ghosted
Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted