EMBridge: Enhancing Gesture Generalization from EMG Signals through Cross-Modal Representation Learning
Authors: Wenhui Cui†, Christopher M. Sandino, Hadi Pouransar, Ran Liu, Juri Minxha, Ellen L. Zippi, Erdrin Azemi, Behrooz Mahasseni
Hand gesture classification using high-quality structured data such as videos, images, and hand skeletons is a well-explored problem in computer vision. Alternatively, leveraging low-power, cost-effective bio-signals, e.g., surface electromyography (sEMG), allows for continuous gesture prediction on wearable devices. In this work, we aim to enhance EMG representation quality by aligning it with embeddings obtained from structured, high-quality modalities that provide richer semantic guidance, ultimately enabling zero-shot gesture generalization. Specifically, we propose EMBridge, a cross-modal representation learning framework that bridges the modality gap between EMG and pose. EMBridge learns high-quality EMG representations by introducing a Querying Transformer (Q-Former), a masked pose reconstruction loss, and a community-aware soft contrastive learning objective that aligns the relative geometry of the embedding spaces. We evaluate EMBridge on both in-distribution and unseen gesture classification tasks and demonstrate consistent performance gains over all baselines. To the best of our knowledge, EMBridge is the first cross-modal representation learning framework to achieve zero-shot gesture classification from wearable EMG signals, showing potential toward real-world gesture recognition on wearable devices.
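The abstract mentions a community-aware soft contrastive objective that aligns the relative geometry of the EMG and pose embedding spaces. The exact formulation is not given here; as a minimal, illustrative sketch, one common way to realize such an objective is to replace InfoNCE's one-hot targets with soft targets derived from pose-pose similarity, so that EMG samples whose poses form a close community receive partial credit. All function names and temperature values below are assumptions, not the paper's implementation:

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Normalize rows to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(x, axis=-1):
    """Numerically stable log-softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def soft_contrastive_loss(emg_emb, pose_emb, tau=0.1, tau_target=0.1):
    """Cross-modal contrastive loss with soft targets (illustrative sketch).

    Instead of one-hot targets (vanilla InfoNCE), the target distribution for
    each EMG sample is a softmax over pose-pose similarities, so semantically
    close poses receive partial credit and the relative geometry of the pose
    space is transferred to the EMG space.
    """
    e = l2_normalize(emg_emb)               # (N, d) EMG embeddings
    p = l2_normalize(pose_emb)              # (N, d) pose embeddings
    logits = e @ p.T / tau                  # EMG-to-pose cosine similarities
    target_logits = p @ p.T / tau_target    # soft targets from pose geometry
    targets = np.exp(log_softmax(target_logits))
    # Cross-entropy between the soft target rows and the model's distribution.
    return float(-(targets * log_softmax(logits)).sum(axis=1).mean())
```

When the pose-target temperature `tau_target` is driven toward zero, the soft targets collapse to one-hot and the loss reduces to standard InfoNCE; larger values spread credit across neighboring poses.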