
Reconstructing an accurate 3D object model from a few image observations remains a challenging problem in computer vision. State-of-the-art approaches typically assume accurate camera poses as input, which can be difficult to obtain in realistic settings. In this paper, we present FvOR, a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses. The core of our approach is a fast and robust multi-view reconstruction algorithm that jointly refines 3D geometry and camera pose estimates using learnable neural network modules. We provide a thorough benchmark of state-of-the-art approaches for this problem on ShapeNet. Our approach achieves best-in-class results. It is also two orders of magnitude faster than the recent optimization-based approach IDR.

Related readings and updates.

High Fidelity 3D Reconstructions with Limited Physical Views

Multi-view triangulation is the gold standard for 3D reconstruction from 2D correspondences, given known calibration and sufficient views. In practice, however, expensive multi-view setups (involving tens, sometimes hundreds, of cameras) are required to obtain the high-fidelity 3D reconstructions necessary for modern applications. In this work we present a novel approach that leverages recent advances in 2D-3D lifting using neural shape priors…

On the Generalization of Learning-based 3D Reconstruction

State-of-the-art learning-based monocular 3D reconstruction methods learn priors over object categories on the training set, and as a result struggle to achieve reasonable generalization to object categories unseen during training. In this paper we study the inductive biases encoded in the model architecture that impact the generalization of learning-based 3D reconstruction methods. We find that 3 inductive biases impact performance: the spatial…