Skill-Enhanced Reinforcement Learning Acceleration from Heterogeneous Demonstrations

December 09, 2024 · Declared Dead · 🏛 European Conference on Artificial Intelligence

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Hanping Zhang, Yuhong Guo arXiv ID 2412.06207 Category cs.LG: Machine Learning Cross-listed cs.AI Citations 0 Venue European Conference on Artificial Intelligence Last Checked 3 months ago

Abstract

Learning from Demonstration (LfD) is a well-established problem in Reinforcement Learning (RL), which aims to facilitate rapid RL by leveraging expert demonstrations to pre-train the RL agent. However, the limited availability of expert demonstration data often hinders its ability to effectively aid downstream RL learning. To address this problem, we propose a novel two-stage method dubbed as Skill-enhanced Reinforcement Learning Acceleration (SeRLA). SeRLA introduces a skill-level adversarial Positive-Unlabeled (PU) learning model that extracts useful skill prior knowledge by learning from both expert demonstrations and general low-cost demonstrations in the offline prior learning stage. Building on this, it employs a skill-based soft actor-critic algorithm to leverage the acquired priors for efficient training of a skill policy network in the downstream online RL stage. In addition, we propose a simple skill-level data enhancement technique to mitigate data sparsity and further improve both skill prior learning and skill policy training. Experiments across multiple standard RL benchmarks demonstrate that SeRLA achieves state-of-the-art performance in accelerating reinforcement learning on downstream tasks, particularly in the early training phase.