*=Equal Contributors

Giving voice assistants the ability to navigate multi-turn conversations is a challenging problem. Handling multi-turn interactions requires the system to understand various conversational use-cases, such as steering, intent carryover, disfluencies, entity carryover, and repair. The problem is compounded by the fact that these use-cases mix with one another and often appear simultaneously in natural language. This work proposes a non-autoregressive query rewriting architecture that can handle not only these five tasks, but also complex compositions of them. We show that our proposed model achieves single-task performance competitive with the baseline approach, and even outperforms a fine-tuned T5 model on use-case compositions, despite having 15 times fewer parameters and running 25 times faster.

Related readings and updates.

STEER: Semantic Turn Extension-Expansion Recognition for Voice Assistants

In the context of a voice assistant system, steering refers to the phenomenon in which a user issues a follow-up command attempting to direct or clarify a previous turn. We propose STEER, a steering detection model that predicts whether a follow-up turn is a user's attempt to steer the previous command. Constructing a training dataset for steering use cases poses challenges due to the cold-start problem. To overcome this, we…

Interspeech Conference 2023

Apple sponsored the Interspeech Conference, which took place in person from August 20 to 24 in Dublin, Ireland. Interspeech is a conference on the science and technology of spoken language processing. Below was the schedule of Apple-sponsored workshops and events at the conference.