Reinforcement Learning to Rank Using Coarse-grained Rewards
August 16, 2022 · Declared Dead
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yiteng Tu, Zhichao Xu, Tao Yang, Weihang Su, Yujia Zhou, Yiqun Liu, Fen Lin, Qin Liu, Qingyao Ai
arXiv ID
2208.07563
Category
cs.IR: Information Retrieval
Citations
6
Last Checked
4 months ago
Abstract
Learning to rank (LTR) plays a crucial role in various Information Retrieval (IR) tasks. Although supervised LTR methods based on fine-grained relevance labels (e.g., document-level annotations) have achieved significant success, their reliance on costly and potentially biased annotations limits scalability and alignment with realistic goals. In contrast, coarse-grained feedback signals, such as duration time and session-level engagement, are more accessible and affordable. Reinforcement Learning (RL) offers a promising framework to directly optimize these objectives using reward signals, but most existing Reinforcement Learning to Rank (RLTR) approaches suffer from high variance and low sample efficiency. Motivated by recent advances in large language models (LLMs), we re-examine the problem of RLTR with coarse-grained rewards and propose new RLTR methods based on widely used RL algorithms for LLMs. We systematically compare supervised learning and RL-based methods across various model architectures and coarse-grained reward functions on large-scale LTR benchmarks. Experimental results demonstrate that advanced RL methods can directly learn from coarse-grained rewards and outperform strong supervised learning baselines even with fine-grained labels. This shows the great potential of RLTR for metric-agnostic ranking optimization.
Similar Papers
In the same crypt: Information Retrieval
R.I.P.
Ghosted
Old Age
Neural Graph Collaborative Filtering
R.I.P.
Ghosted
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
R.I.P.
Ghosted
BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
R.I.P.
404 Not Found
Graph Neural Networks for Social Recommendation
R.I.P.
Ghosted
Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted