Continuous interaction with a smart speaker via low-dimensional embeddings of dynamic hand pose
February 28, 2023 Β· Declared Dead Β· π IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Songpei Xu, Chaitanya Kaul, Xuri Ge, Roderick Murray-Smith
arXiv ID
2302.14566
Category
cs.HC: Human-Computer Interaction
Citations
5
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Last Checked
4 months ago
Abstract
This paper presents a new continuous interaction strategy with visual feedback of hand pose and mid-air gesture recognition and control for a smart music speaker, which utilizes only 2 video frames to recognize gestures. Frame-based hand pose features from MediaPipe Hands, containing 21 landmarks, are embedded into a 2 dimensional pose space by an autoencoder. The corresponding space for interaction with the music content is created by embedding high-dimensional music track profiles to a compatible two-dimensional embedding. A PointNet-based model is then applied to classify gestures which are used to control the device interaction or explore music spaces. By jointly optimising the autoencoder with the classifier, we manage to learn a more useful embedding space for discriminating gestures. We demonstrate the functionality of the system with experienced users selecting different musical moods by varying their hand pose.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Human-Computer Interaction
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Improving fairness in machine learning systems: What do industry practitioners need?
R.I.P.
π»
Ghosted
Identifying Stable Patterns over Time for Emotion Recognition from EEG
R.I.P.
π»
Ghosted
Questioning the AI: Informing Design Practices for Explainable AI User Experiences
R.I.P.
π»
Ghosted
Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges and Opportunities
R.I.P.
π»
Ghosted
Educational data mining and learning analytics: An updated survey
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted