Paper, March 2023

Pointersect: Neural Rendering with Cloud-Ray Intersection

In collaboration with Carnegie Mellon University and The University of British Columbia

Authors: Jen-Hao Rick Chang, Wei-Yu Chen, Anurag Ranjan, Kwang Moo Yi, Oncel Tuzel

Pointersect is a plug-and-play rendering algorithm for unseen point clouds that requires no per-scene optimization. It takes a point cloud (positions and, optionally, colors) as input and directly renders images from novel views. The same model also estimates surface normals and depth. It is differentiable and supports scene editing without any retraining.
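To make the idea of a cloud-ray intersection concrete, here is a minimal sketch of the query that a renderer of this kind has to answer: given a camera ray and a point cloud, where does the ray hit the underlying surface, and what color is seen there? Pointersect answers this with a learned model. The code below is not that model. It is a hand-written geometric heuristic (an inverse-distance average over points lying close to the ray), and the function name `naive_ray_cloud_hit` and its parameters `radius` and `k` are assumptions made for illustration only.

```python
import numpy as np

def naive_ray_cloud_hit(origin, direction, points, colors=None, radius=0.05, k=8):
    """Heuristic ray/point-cloud intersection (illustrative stand-in, not Pointersect).

    origin, direction: (3,) arrays describing the ray.
    points: (N, 3) array of positions; colors: optional (N, 3) array.
    Returns (depth, hit_position, color) or None if no point lies near the ray.
    """
    d = direction / np.linalg.norm(direction)
    rel = points - origin                      # vectors from ray origin to each point
    t = rel @ d                                # distance of each point along the ray
    perp = np.linalg.norm(rel - t[:, None] * d, axis=1)  # distance from the ray

    # Candidates: points in front of the origin and within `radius` of the ray.
    idx = np.flatnonzero((t > 0) & (perp < radius))
    if idx.size == 0:
        return None

    # Keep the k candidates closest to the ray origin, so that surfaces
    # further back along the ray are less likely to be blended in.
    idx = idx[np.argsort(t[idx])[:k]]

    # Inverse-distance weights: points nearer the ray count more.
    w = 1.0 / (perp[idx] + 1e-8)
    w /= w.sum()

    depth = float(w @ t[idx])
    color = w @ colors[idx] if colors is not None else None
    return depth, origin + depth * d, color
```

Calling this once per pixel ray of a novel camera would yield a depth map and an image. A heuristic like this depends heavily on the choice of `radius` and blurs fine detail on sparse or noisy clouds, which is the kind of limitation a learned intersection estimator is meant to address.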

All results on this webpage are rendered by the same model, which was trained on only 48 meshes (shown at the bottom).

Video 1: We render lidar-scanned point clouds using Pointersect. Pointersect lets us move within point clouds captured by lidar, relight them, and interact with the virtual scenes directly, all without retraining. The scenes are very different from the training dataset (shown at the bottom of the page), and the scanned point clouds contain noise. This demonstrates the generalization capability of Pointersect. The point clouds are from the ARKitScenes dataset.

Video 3: We compare pointersect on various unseen scenes with state-of-the-art point-cloud rendering methods, including NPBG++[1], Neural Points[2], and screened Poisson surface reconstruction[3]. For each scene, the inputs are 6 low-resolution RGBD images. Pointersect recovers high-frequency geometry (e.g., the mast of the boat) and textures (e.g., the zebra stripes) of the ground truth meshes from the low-resolution point cloud.
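For readers unfamiliar with how RGBD images become a point cloud, the following is a minimal sketch of the standard pinhole back-projection. It is a generic illustration, not the preprocessing code used for these results. The function name `rgbd_to_point_cloud`, the intrinsics `fx`, `fy`, `cx`, `cy`, the optional 4x4 `cam_to_world` pose, and the convention that a depth of zero marks an invalid pixel are all assumptions.

```python
import numpy as np

def rgbd_to_point_cloud(depth, rgb, fx, fy, cx, cy, cam_to_world=None):
    """Back-project one RGBD image into a colored point cloud (pinhole model).

    depth: (H, W) array of z-depths, with 0 assumed to mean "no measurement".
    rgb:   (H, W, 3) array of colors aligned with `depth`.
    Returns (points, colors) with shapes (M, 3) and (M, 3).
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel row (v) and column (u) indices
    valid = depth > 0                  # drop pixels without a depth reading

    z = depth[valid]
    x = (u[valid] - cx) * z / fx       # invert u = fx * x / z + cx
    y = (v[valid] - cy) * z / fy       # invert v = fy * y / z + cy
    points = np.stack([x, y, z], axis=1)

    if cam_to_world is not None:
        # Apply the rigid transform from camera to world coordinates.
        points = points @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]
    return points, rgb[valid]
```

Point clouds from several views, such as the 6 input images per scene, can then be merged by concatenating the per-view outputs with `np.concatenate`, provided each view is first moved into a shared world frame.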

Figure 1: Our entire training set. Please see our paper for credits.

References

[1] NPBG++.

[2] Neural Points.

[3] Screened Poisson surface reconstruction.

