Continuous Control with Coarse-to-fine Reinforcement Learning

July 10, 2024 · Entered Twilight · 🏛 Conference on Robot Learning

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

"No code URL or promise found in abstract"
"Code repo scraped from project page (backfill)"

Evidence collected by the PWNC Scanner

Repo contents: README.md, cfgs, conda_env.yml, cqn.py, cqn_dmc.py, cqn_utils.py, dmc.py, drqv2plus.py, logger.py, media, replay_buffer.py, replay_buffer_dmc.py, rlbench_env.py, train_dmc.py, train_rlbench.py, train_rlbench_drqv2plus.py, utils.py, video.py

Authors: Younggyo Seo, Jafar Uruç, Stephen James
arXiv ID: 2407.07787
Category: cs.RO (Robotics)
Cross-listed: cs.AI, cs.CV, cs.LG, eess.SY
Citations: 18
Venue: Conference on Robot Learning
Repository: https://github.com/younggyoseo/CQN (⭐ 59)
Last Checked: 1 month ago
Abstract
Despite recent advances in improving the sample-efficiency of reinforcement learning (RL) algorithms, designing an RL algorithm that can be practically deployed in real-world environments remains a challenge. In this paper, we present Coarse-to-fine Reinforcement Learning (CRL), a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner, enabling the use of stable, sample-efficient value-based RL algorithms for fine-grained continuous control tasks. Our key idea is to train agents that output actions by iterating the procedure of (i) discretizing the continuous action space into multiple intervals and (ii) selecting the interval with the highest Q-value to further discretize at the next level. We then introduce a concrete, value-based algorithm within the CRL framework called Coarse-to-fine Q-Network (CQN). Our experiments demonstrate that CQN significantly outperforms RL and behavior cloning baselines on 20 sparsely-rewarded RLBench manipulation tasks with a modest number of environment interactions and expert demonstrations. We also show that CQN robustly learns to solve real-world manipulation tasks within a few minutes of online training.
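
The two-step procedure in the abstract (discretize, then zoom into the interval with the highest Q-value) can be illustrated with a minimal Python sketch. This is not the code from the CQN repository: the function names, the `q_fn(level, low, high)` signature, the default `levels` and `bins` values, and the toy Q-function are all assumptions made for illustration, and the sketch treats each action dimension independently.

```python
import numpy as np


def coarse_to_fine_action(q_fn, low, high, levels=3, bins=5):
    """Greedy coarse-to-fine action selection (illustrative sketch only).

    q_fn(level, low, high) is a hypothetical callable that returns an array
    of shape (action_dim, bins): one Q-value per candidate interval.
    """
    low = np.asarray(low, dtype=float).copy()
    high = np.asarray(high, dtype=float).copy()
    for level in range(levels):
        q = q_fn(level, low, high)       # (action_dim, bins)
        idx = np.argmax(q, axis=1)       # (ii) best interval per dimension
        width = (high - low) / bins      # (i) split current range into bins
        low = low + idx * width          # zoom into the selected interval
        high = low + width
    return (low + high) / 2.0            # centre of the finest interval


def make_toy_q(target, bins):
    """Hypothetical stand-in for a learned Q-network.

    Scores each interval by how close its centre is to a known target action.
    """
    target = np.asarray(target, dtype=float)

    def q_fn(level, low, high):
        width = (high - low) / bins
        centers = low[:, None] + (np.arange(bins) + 0.5) * width[:, None]
        return -np.abs(centers - target[:, None])

    return q_fn


if __name__ == "__main__":
    q_fn = make_toy_q(target=[0.3, -0.7], bins=5)
    action = coarse_to_fine_action(q_fn, low=[-1.0, -1.0], high=[1.0, 1.0])
    print(action)
```

With `levels=3` and `bins=5`, the final interval is 1/125 of the original range in each dimension, so a small number of discrete choices per level still yields a fine-grained continuous action.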