Recent research
Research articles
Fast Class-Agnostic Salient Object Segmentation
In 2022, we launched a new systemwide capability that allows users to automatically and instantly lift the subject from an image or isolate the subject by removing the background. This feature is integrated across iOS, iPadOS, and macOS, and is accessible in several apps like Photos, Preview, Safari, Keynote, and more. Underlying this feature is an on-device deep neural network that performs real-time salient object segmentation; that is, it categorizes each pixel of an image as part of either the foreground or the background. Each pixel is assigned a score denoting how likely it is to be part of the foreground. While prior methods often restrict this process to a fixed set of semantic categories (like people and pets), we designed our model to be unrestricted and to generalize to arbitrary classes of subjects (for example, furniture, apparel, collectibles), including ones it hasn’t encountered during training. While this is an active area of research in computer vision, many unique challenges arise when considering this problem within the constraints of a product ready to be used by consumers. This year, we are launching Live Stickers in iOS and iPadOS, as seen in Figure 1, where static and animated sticker creation are built on the technology discussed in this article. In the following sections, we’ll explore some of these challenges and how we approached them.
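To make the per-pixel scoring idea concrete, here is a minimal Python sketch of how a grid of foreground scores could be turned into a binary mask and then used to lift a subject. It is an illustration only, not the model or pipeline described in the article: the function names `foreground_mask` and `lift_subject`, the 0.5 threshold, and the use of `None` for removed background pixels are all assumptions made for this example.

```python
def foreground_mask(scores, threshold=0.5):
    """Convert a 2D grid of per-pixel foreground scores (0.0 to 1.0)
    into a binary mask: 1 for foreground, 0 for background.

    The 0.5 threshold is an assumption for illustration only.
    """
    return [[1 if score >= threshold else 0 for score in row] for row in scores]


def lift_subject(pixels, mask, background=None):
    """Keep pixels where the mask is 1 and replace the rest with a
    placeholder background value (hypothetical helper)."""
    return [
        [pixel if keep else background for pixel, keep in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(pixels, mask)
    ]


# Toy 2x3 "image" with made-up scores, standing in for a network's output.
scores = [[0.1, 0.8, 0.9],
          [0.2, 0.7, 0.4]]
pixels = [["a", "b", "c"],
          ["d", "e", "f"]]

mask = foreground_mask(scores)
subject = lift_subject(pixels, mask)
```

A hard threshold is the simplest way to read a score map. A production system would more likely keep soft scores near the subject boundary so that edges blend smoothly, but that is beyond this sketch.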
Improved Speech Recognition for People Who Stutter
Speech recognition systems have improved substantially in recent years, leading to widespread adoption across computing platforms. Two common forms of speech interaction are voice assistants (VAs), which listen for spoken commands and respond accordingly, and dictation systems, which act as an alternative to a keyboard by converting the user's open-ended speech to written text for messages, emails, and so on. Speech interaction is especially important for devices with small or no screens, such as smart speakers and smart headphones. Yet speech presents barriers for many people with communication disabilities such as stuttering, dysarthria, or aphasia.
All events
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023
Apple is sponsoring the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), which will take place in person from June 18 to 22 in Vancouver, Canada. CVPR is the premier annual computer vision event, comprising the main conference and several co-located workshops and short courses. Below is the schedule of Apple-sponsored workshops and events at CVPR 2023.
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023
Apple is sponsoring the International Conference on Acoustics, Speech and Signal Processing (ICASSP), which will take place in person from June 4 to 10 in Rhodes Island, Greece. ICASSP is the IEEE Signal Processing Society's flagship conference on signal processing and its applications. Below is the schedule of Apple-sponsored workshops and events at ICASSP 2023.
