One-Shot Learning of Multi-Step Tasks from Observation via Activity Localization in Auxiliary Video

June 29, 2018 · Declared Dead · 🏛 IEEE International Conference on Robotics and Automation

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Wonjoon Goo, Scott Niekum
arXiv ID: 1806.11244
Category: cs.LG: Machine Learning
Cross-listed: cs.RO, stat.ML
Citations: 34
Venue: IEEE International Conference on Robotics and Automation
Last Checked: 4 months ago

Abstract

Due to burdensome data requirements, learning from demonstration often falls short of its promise to allow users to quickly and naturally program robots. Demonstrations are inherently ambiguous and incomplete, making correct generalization to unseen situations difficult without a large number of demonstrations in varying conditions. By contrast, humans are often able to learn complex tasks from a single demonstration (typically observations without action labels) by leveraging context learned over a lifetime. Inspired by this capability, our goal is to enable robots to perform one-shot learning of multi-step tasks from observation by leveraging auxiliary video data as context. Our primary contribution is a novel system that achieves this goal by: (1) using a single user-segmented demonstration to define the primitive actions that comprise a task, (2) localizing additional examples of these actions in unsegmented auxiliary videos via a metalearning-based approach, (3) using these additional examples to learn a reward function for each action, and (4) performing reinforcement learning on top of the inferred reward functions to learn action policies that can be combined to accomplish the task. We empirically demonstrate that a robot can learn multi-step tasks more effectively when provided auxiliary video, and that performance greatly improves when localizing individual actions, compared to learning from unsegmented videos.
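Step (2) of the abstract, localizing further examples of a demonstrated action in unsegmented auxiliary video, can be illustrated with a deliberately simplified sketch. This is not the authors' metalearning-based method, and no code for the paper was found. It swaps in a plain sliding-window cosine-similarity match over per-frame feature vectors. The function names (`cosine`, `mean_vector`, `localize`), the `window` and `threshold` parameters, and the toy two-dimensional features are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(frames):
    """Average a non-empty list of per-frame feature vectors into one vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def localize(demo_segment, video, window, threshold):
    """Hypothetical stand-in for action localization.

    Slides a fixed-size window over `video` (a list of per-frame feature
    vectors) and returns (start, end, score) for every window whose mean
    feature is at least `threshold`-similar to the mean feature of the
    user-segmented demonstration clip. `end` is exclusive.
    """
    prototype = mean_vector(demo_segment)
    hits = []
    for start in range(len(video) - window + 1):
        score = cosine(prototype, mean_vector(video[start:start + window]))
        if score >= threshold:
            hits.append((start, start + window, score))
    return hits


# Toy data: the demonstrated action looks like [1, 0]; everything else like [0, 1].
demo_segment = [[1.0, 0.0], [1.0, 0.0]]
auxiliary_video = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

hits = localize(demo_segment, auxiliary_video, window=2, threshold=0.99)
print(hits)
```

In the pipeline the abstract describes, the localized segments would then serve as the additional examples from which a per-action reward function is learned in step (3). That stage, and the reinforcement learning in step (4), are outside this sketch.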