IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition
September 18, 2024 Β· Declared Dead Β· π IEEE International Conference on Robotics and Automation
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Rui Liu, Zahiruddin Mahammad, Amisha Bhaskar, Pratap Tokekar
arXiv ID
2409.12092
Category
cs.RO: Robotics
Cross-listed
cs.AI
Citations
9
Venue
IEEE International Conference on Robotics and Automation
Last Checked
4 months ago
Abstract
Robotic assistive feeding holds significant promise for improving the quality of life for individuals with eating disabilities. However, acquiring diverse food items under varying conditions and generalizing to unseen food presents unique challenges. Existing methods that rely on surface-level geometric information (e.g., bounding box and pose) derived from visual cues (e.g., color, shape, and texture) often lacks adaptability and robustness, especially when foods share similar physical properties but differ in visual appearance. We employ imitation learning (IL) to learn a policy for food acquisition. Existing methods employ IL or Reinforcement Learning (RL) to learn a policy based on off-the-shelf image encoders such as ResNet-50. However, such representations are not robust and struggle to generalize across diverse acquisition scenarios. To address these limitations, we propose a novel approach, IMRL (Integrated Multi-Dimensional Representation Learning), which integrates visual, physical, temporal, and geometric representations to enhance the robustness and generalizability of IL for food acquisition. Our approach captures food types and physical properties (e.g., solid, semi-solid, granular, liquid, and mixture), models temporal dynamics of acquisition actions, and introduces geometric information to determine optimal scooping points and assess bowl fullness. IMRL enables IL to adaptively adjust scooping strategies based on context, improving the robot's capability to handle diverse food acquisition scenarios. Experiments on a real robot demonstrate our approach's robustness and adaptability across various foods and bowl configurations, including zero-shot generalization to unseen settings. Our approach achieves improvement up to $35\%$ in success rate compared with the best-performing baseline. More details can be found on our website https://ruiiu.github.io/imrl.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Robotics
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
AirSim: High-Fidelity Visual and Physical Simulation for Autonomous Vehicles
π
π
The Cartographer
A Survey of Motion Planning and Control Techniques for Self-driving Urban Vehicles
π
π
The Cartographer
Unmanned Aerial Vehicles: A Survey on Civil Applications and Key Research Challenges
π
π
The Cartographer
A Survey of Autonomous Driving: Common Practices and Emerging Technologies
R.I.P.
π»
Ghosted
Learning agile and dynamic motor skills for legged robots
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted