Boosting Feedback Efficiency of Interactive Reinforcement Learning by Adaptive Learning from Scores
July 11, 2023 · Entered Twilight · IEEE/RSJ International Conference on Intelligent Robots and Systems
Repo contents: GUI_screenshot.png, LICENSE.md, README.md, config, custom_env.py, main.py, model.py, pure_sac_train.py, rate_trajectory.py, rate_trajectory.ui, rate_window.py, replay_memory.py, requirements.txt, reward_net.py, sac.py, utils.py, wandb_to_plot.py
Authors
Shukai Liu, Chenming Wu, Ying Li, Liangjun Zhang
arXiv ID
2307.05405
Category
cs.RO: Robotics
Cross-listed
cs.LG
Citations
2
Venue
IEEE/RSJ International Conference on Intelligent Robots and Systems
Repository
https://github.com/SSKKai/Interactive-Scoring-IRL
⭐ 5
Abstract
Interactive reinforcement learning has shown promise in learning complex robotic tasks. However, the process can be human-intensive due to the requirement of a large amount of interactive feedback. This paper presents a new method that uses scores provided by humans instead of pairwise preferences to improve the feedback efficiency of interactive reinforcement learning. Our key insight is that scores can yield significantly more data than pairwise preferences. Specifically, we require a teacher to interactively score the full trajectories of an agent to train a behavioral policy in a sparse reward environment. To avoid unstable scores given by humans negatively impacting the training process, we propose an adaptive learning scheme. This enables the learning paradigm to be insensitive to imperfect or unreliable scores. We extensively evaluate our method for robotic locomotion and manipulation tasks. The results show that the proposed method can efficiently learn near-optimal policies by adaptive learning from scores while requiring less feedback compared to pairwise preference learning methods. The source code is publicly available at https://github.com/SSKKai/Interactive-Scoring-IRL.
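One way to read the claim that scores yield more data than pairwise preferences, offered as an illustration rather than the authors' implementation: every pair of scored trajectories implies a preference label, so n scores can induce up to n(n-1)/2 labeled pairs. The function name `pairs_from_scores`, the `tie_margin` parameter, and the 0.5 label for ties are assumptions made for this sketch and are not taken from the paper or the repository.

```python
from itertools import combinations


def pairs_from_scores(scores, tie_margin=0.0):
    """Derive pairwise preference labels from per-trajectory scores.

    Hypothetical helper for illustration only.
    Returns a list of (i, j, label) tuples, where label is 1.0 if
    trajectory i scored higher, 0.0 if trajectory j scored higher,
    and 0.5 if the two scores differ by at most tie_margin.
    """
    labeled = []
    for i, j in combinations(range(len(scores)), 2):
        diff = scores[i] - scores[j]
        if abs(diff) <= tie_margin:
            label = 0.5  # treated as equally preferred (assumption)
        elif diff > 0:
            label = 1.0
        else:
            label = 0.0
        labeled.append((i, j, label))
    return labeled


# Four human scores (made-up values) induce six labeled pairs.
scores = [7.0, 3.5, 9.0, 3.5]
pairs = pairs_from_scores(scores)
print(len(scores), "scores ->", len(pairs), "pairs")
```

Under this reading, a teacher who gives four scores supplies the same number of comparisons as six separate pairwise queries, and the gap grows quadratically with the number of scored trajectories.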
Similar Papers (Robotics)
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World