
Human following is an important human-robot interaction feature, but real-world scenarios make it challenging, particularly for a mobile agent. The main challenge is that when a mobile agent tries to locate and follow a targeted person, this person can be in a crowd, be occluded by other people, and/or be facing (partially) away from the mobile agent. To address this challenge, we present a novel person re-identification module, which contains three parts: 1) a 360-degree visual registration process, 2) a neural-based person re-identification mechanism that uses multiple body parts (human faces and torsos), and 3) a motion model that records the person's motion and predicts the person's future position. In addition to the person re-identification module, our human-following system also tackles other challenges: 1) the targeted person can be fast-moving (so person identification must run at low latency), 2) the targeted person can move out of the camera's sight (so the system must search for the person without visual contact), and 3) collision avoidance (so the system must avoid hitting obstacles). Through extensive experiments, we observe that our proposed person re-identification module greatly improves the human-following feature when compared to other baseline variants.
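
To make the third part of the module concrete, the following is a minimal sketch of a motion model that records timestamped positions and extrapolates a future one. The abstract does not say which motion model is used, so the constant-velocity assumption, the class name `ConstantVelocityModel`, and the `history` parameter are all hypothetical and chosen only for illustration.

```python
from collections import deque


class ConstantVelocityModel:
    """Hypothetical motion model (an assumption, not the paper's method).

    Stores recent (t, x, y) observations of the tracked person and
    extrapolates linearly from the oldest to the newest stored sample.
    """

    def __init__(self, history=5):
        # Keep only the most recent `history` observations.
        self.track = deque(maxlen=history)

    def record(self, t, x, y):
        """Record the person's position (x, y) observed at time t."""
        self.track.append((t, x, y))

    def predict(self, t_future):
        """Predict the position at t_future, or None if nothing was recorded."""
        if not self.track:
            return None
        t1, x1, y1 = self.track[-1]
        if len(self.track) < 2:
            # A single observation gives no velocity estimate.
            return (x1, y1)
        t0, x0, y0 = self.track[0]
        dt = t1 - t0
        if dt <= 0:
            # Degenerate timestamps: fall back to the last known position.
            return (x1, y1)
        vx = (x1 - x0) / dt
        vy = (y1 - y0) / dt
        horizon = t_future - t1
        return (x1 + vx * horizon, y1 + vy * horizon)


model = ConstantVelocityModel()
model.record(0.0, 0.0, 0.0)
model.record(1.0, 1.0, 0.5)
prediction = model.predict(3.0)
```

A prediction of this kind could, for example, suggest where to search when the person leaves the camera's sight. A real system would more likely use a filtered estimate that accounts for sensor noise.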
