Toward Supporting Quality Alt Text in Computing Publications

AuthorsCandace Williams, Lilian de Greef, Ed Harris, Leah Findlater, Amy Pavel, Cynthia Bennett

While researchers have examined alternative (alt) text for social media and news contexts, few have studied the status and challenges for authoring alt text of figures in computing-related publications. These figures are distinct, often conveying dense visual information, and may necessitate unique accessibility solutions. Accordingly, we explored how to support authors in creating alt text in computing publications---specifically in the field of human-computer interaction (HCI). We conducted two studies: (1) an analysis of 300 recently published figures at a general HCI conference (ACM CHI), and (2) interviews with 10 researchers in HCI and related fields who have varying levels of experience writing alt text. Our findings characterize the prevalence, quality, and patterns of recent figure alt text and captions. We further identify challenges authors encounter, describing their workflow barriers and confusions around how to compose alt text for complex figures. We conclude by outlining a research agenda on process, education, and tooling opportunities to improve alt text in computing-related publications.

Related readings and updates.

March 15, 2024research area Methods and Algorithms

In the fast-evolving world of natural language processing (NLP), there is a strong demand for generating coherent and controlled text, as referenced in the work Toward Controlled Generation of Text. Traditional autoregressive models such as GPT, which have long been the industry standard, possess inherent limitations that sometimes manifest as repetitive and low-quality outputs, as seen in the work The Curious Case of Neural Text Degeneration. This is primarily due to a phenomenon known as "exposure bias," as seen in the work Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks. This imperfection arises due to a mismatch between how these models are trained and their actual use during inference, often leading to error accumulation during text generation.

June 12, 2023research area Speech and Natural Language Processingconference Interspeech

Using audio and text embeddings jointly for Keyword Spotting (KWS) has shown high-quality results, but the key challenge of how to semantically align two embeddings for multi-word keywords of different sequence lengths remains largely unsolved. In this paper, we propose an audio-text-based end-to-end model architecture for flexible keyword spotting (KWS), which builds upon learned audio and text embeddings. Our architecture uses a novel...

Toward Supporting Quality Alt Text in Computing Publications

Related readings and updates.

Enhancing Paragraph Generation with a Latent Language Diffusion Model

Matching Latent Encoding for Audio-Text based Keyword Spotting

Discover opportunities in Machine Learning.