i-Sim2Real: Reinforcement Learning of Robotic Policies in Tight Human-Robot Interaction Loops
July 14, 2022 ยท Declared Dead ยท ๐ Conference on Robot Learning
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Saminda Abeyruwan, Laura Graesser, David B. D'Ambrosio, Avi Singh, Anish Shankar, Alex Bewley, Deepali Jain, Krzysztof Choromanski, Pannag R. Sanketi
arXiv ID
2207.06572
Category
cs.RO: Robotics
Citations
66
Venue
Conference on Robot Learning
Last Checked
2 months ago
Abstract
Sim-to-real transfer is a powerful paradigm for robotic reinforcement learning. The ability to train policies in simulation enables safe exploration and large-scale data collection quickly at low cost. However, prior works in sim-to-real transfer of robotic policies typically do not involve any human-robot interaction because accurately simulating human behavior is an open problem. In this work, our goal is to leverage the power of simulation to train robotic policies that are proficient at interacting with humans upon deployment. But there is a chicken and egg problem -- how to gather examples of a human interacting with a physical robot so as to model human behavior in simulation without already having a robot that is able to interact with a human? Our proposed method, Iterative-Sim-to-Real (i-S2R), attempts to address this. i-S2R bootstraps from a simple model of human behavior and alternates between training in simulation and deploying in the real world. In each iteration, both the human behavior model and the policy are refined. For all training we apply a new evolutionary search algorithm called Blackbox Gradient Sensing (BGS). We evaluate our method on a real world robotic table tennis setting, where the objective for the robot is to play cooperatively with a human player for as long as possible. Table tennis is a high-speed, dynamic task that requires the two players to react quickly to each other's moves, making for a challenging test bed for research on human-robot interaction. We present results on an industrial robotic arm that is able to cooperatively play table tennis with human players, achieving rallies of 22 successive hits on average and 150 at best. Further, for 80% of players, rally lengths are 70% to 175% longer compared to the sim-to-real plus fine-tuning (S2R+FT) baseline. For videos of our system in action, please see https://sites.google.com/view/is2r.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Robotics
๐
๐
Old Age
R.I.P.
๐ป
Ghosted
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
R.I.P.
๐ป
Ghosted
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
R.I.P.
๐ป
Ghosted
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
R.I.P.
๐ป
Ghosted
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
R.I.P.
๐ป
Ghosted
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted