Human following is an important human-robot interaction feature, but real-world scenarios make it challenging, particularly for a mobile agent. The main challenge is that when a mobile agent tries to locate and follow a targeted person, that person may be in a crowd, occluded by other people, and/or facing (partially) away from the agent. To address this challenge, we present a novel person re-identification module with three parts: 1) a 360-degree visual registration process, 2) a neural person re-identification mechanism that uses multiple body parts, namely human faces and torsos, and 3) a motion model that records the person's motion and predicts their future position. Beyond the person re-identification module, our human-following system also tackles other challenges: 1) the targeted person can be fast-moving, so person identification must run at low latency; 2) the targeted person can move out of the camera's view, so the system must search for the person without a line of sight; and 3) the agent must avoid colliding with obstacles. Through extensive experiments, we observe that our proposed person re-identification module greatly improves the human-following feature compared with baseline variants.
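
To make the third part of the module concrete, the sketch below shows one simple way a motion model could record observed positions and extrapolate a future one. The abstract does not specify the model used, so the constant-velocity assumption, the class name `ConstantVelocityTracker`, and the `history` parameter are all hypothetical choices for illustration, not the authors' method.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Point = Tuple[float, float]  # (x, y) position in meters, in the agent's map frame


class ConstantVelocityTracker:
    """Hypothetical motion model: keeps recent positions and extrapolates linearly.

    This is an illustrative assumption, not the model described in the paper.
    """

    def __init__(self, history: int = 5) -> None:
        if history < 2:
            raise ValueError("history must be at least 2 to estimate a velocity")
        # Only the most recent `history` observations are kept.
        self._obs: Deque[Tuple[float, Point]] = deque(maxlen=history)

    def record(self, t: float, position: Point) -> None:
        """Store one timestamped observation of the targeted person."""
        self._obs.append((t, position))

    def predict(self, t_future: float) -> Optional[Point]:
        """Extrapolate the position at time t_future; None if nothing was recorded."""
        if not self._obs:
            return None
        t_last, (x_last, y_last) = self._obs[-1]
        if len(self._obs) < 2:
            return (x_last, y_last)
        t_first, (x_first, y_first) = self._obs[0]
        dt = t_last - t_first
        if dt <= 0:
            # Timestamps do not advance, so no velocity can be estimated.
            return (x_last, y_last)
        # Average velocity over the stored window.
        vx = (x_last - x_first) / dt
        vy = (y_last - y_first) / dt
        horizon = t_future - t_last
        return (x_last + vx * horizon, y_last + vy * horizon)


tracker = ConstantVelocityTracker(history=5)
tracker.record(0.0, (0.0, 0.0))
tracker.record(1.0, (1.0, 0.5))
predicted = tracker.predict(3.0)
```

A prediction like this could give the agent a place to search when the person leaves the camera's view, although a real system would likely need a filter that accounts for measurement noise and changes in direction.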

Related readings and updates.

Randomized Algorithms for Precise Measurement of Differentially-private, Personalized Recommendations

This paper was accepted at the 5th AAAI Workshop on Privacy-Preserving Artificial Intelligence. Personalized recommendations form an important part of today's internet ecosystem, helping artists and creators to reach interested users, and helping users to discover new and engaging content. However, many users today are skeptical of platforms that personalize recommendations, in part due to historically careless treatment of personal data and data…

Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis

Adapting generic speech recognition models to specific individuals is a challenging problem due to the scarcity of personalized data. Recent works have proposed boosting the amount of training data using personalized text-to-speech synthesis. Here, we ask two fundamental questions about this strategy: when is synthetic data effective for personalization, and why is it effective in those cases? To address the first question, we adapt a…