Toward Supporting Quality Alt Text in Computing Publications
Authors: Candace Williams, Lilian de Greef, Ed Harris, Leah Findlater, Amy Pavel, Cynthia Bennett
While researchers have examined alternative (alt) text for social media and news contexts, few have studied the status and challenges for authoring alt text of figures in computing-related publications. These figures are distinct, often conveying dense visual information, and may necessitate unique accessibility solutions. Accordingly, we explored how to support authors in creating alt text in computing publications---specifically in the field of human-computer interaction (HCI). We conducted two studies: (1) an analysis of 300 recently published figures at a general HCI conference (ACM CHI), and (2) interviews with 10 researchers in HCI and related fields who have varying levels of experience writing alt text. Our findings characterize the prevalence, quality, and patterns of recent figure alt text and captions. We further identify challenges authors encounter, describing their workflow barriers and confusions around how to compose alt text for complex figures. We conclude by outlining a research agenda on process, education, and tooling opportunities to improve alt text in computing-related publications.
Closing the Gap Between Text and Speech Understanding in LLMs
February 25, 2026 · Research area: Speech and Natural Language Processing · Conference: ICLR
Large Language Models (LLMs) can be adapted to extend their text capabilities to speech inputs. However, these speech-adapted LLMs consistently underperform their text-based counterparts—and even cascaded pipelines—on language understanding tasks. We term this shortfall the text-speech understanding gap: the performance drop observed when a speech-adapted LLM processes spoken inputs relative to when the original text-based LLM processes the…
Enhancing Paragraph Generation with a Latent Language Diffusion Model
March 15, 2024 · Research area: Methods and Algorithms
In the fast-evolving world of natural language processing (NLP), there is strong demand for generating coherent and controllable text (see Toward Controlled Generation of Text). Traditional autoregressive models such as GPT, long the industry standard, have inherent limitations that can manifest as repetitive, low-quality outputs (see The Curious Case of Neural Text Degeneration). This is primarily due to a phenomenon known as "exposure bias" (see Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks): a mismatch between how these models are trained and how they are used at inference, which often leads to error accumulation during text generation.
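The train/inference mismatch behind exposure bias can be illustrated with a toy sketch (a hypothetical example, not the paper's model): a bigram predictor is "trained" by conditioning on ground-truth context (the analogue of teacher forcing), but at inference it must condition on its own previous outputs, so an early mistake feeds forward into every later step.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count next-character frequencies conditioned on the TRUE previous
    character -- the analogue of teacher forcing during training."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    # Greedy lookup table: the most frequent successor of each character.
    return {c: cnt.most_common(1)[0][0] for c, cnt in counts.items()}

def generate(model, start, n):
    """Free-running inference: each step feeds the model's OWN previous
    output back in, so errors are never corrected and can accumulate."""
    out = [start]
    for _ in range(n):
        out.append(model.get(out[-1], "?"))
    return "".join(out)

model = train_bigram("abcabcabd")
print(generate(model, "a", 5))  # -> "abcabc"
```

During training, every prediction is scored against a context the model never produced itself; during generation, the model only ever sees its own outputs. Latent diffusion approaches sidestep this by refining the whole sequence jointly rather than committing to tokens left to right.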