Unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology.
Many state-of-the-art learning-based 3D reconstruction approaches can only represent very coarse 3D geometry or are limited to a restricted domain.
While generative models have recently achieved remarkable success in generating realistic high-resolution images, this success has not yet been replicated in the 3D domain.
Existing 3D output representations are either not memory efficient or cannot be efficiently inferred from data.
➡️ This paper proposes an approach to 3D reconstruction based on directly learning the continuous 3D occupancy function (Fig 1d).
This paper proposes Occupancy Networks, a new representation for learning-based 3D reconstruction methods.
It implicitly represents the 3D surface as the continuous decision boundary of a deep neural network classifier.
In contrast to existing approaches, this representation encodes a description of the 3D output at infinite resolution without an excessive memory footprint.
Occupancy function: the paper uses this name for the function that describes occupancy at every possible 3D point p ∈ R^3, rather than only at fixed discrete locations.
The key insight is that this 3D function can be approximated by a NN that assigns every location p ∈ R^3 an occupancy probability between 0 and 1.
(This is equivalent to a NN for binary classification, except that the quantity of interest is the decision boundary, which implicitly represents the object's surface.)
To extract the isosurface for a new observation from a trained occupancy network, the paper introduces MISE (Multiresolution IsoSurface Extraction), a hierarchical isosurface extraction algorithm (Fig 2).
MISE extracts high-resolution meshes from the occupancy network without densely evaluating all points of a high-dimensional occupancy grid.
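A rough sketch of the hierarchical idea, assuming a simple coarse-to-fine octree-style refinement: evaluate a coarse grid, keep only cells whose corners disagree about occupancy, split those into 8, and repeat. The function names, the toy sphere occupancy, and the parameters `n0`, `steps`, and `tau` are illustrative assumptions; the sketch stops at locating surface-crossing cells and does not build a mesh.

```python
import itertools
import math

def toy_occupancy(p, radius=0.5, sharpness=20.0):
    """Hypothetical stand-in for a trained occupancy network (soft sphere)."""
    dist = math.sqrt(sum(c * c for c in p))
    return 1.0 / (1.0 + math.exp(-sharpness * (radius - dist)))

def mise_active_voxels(occ, n0=4, steps=3, tau=0.5, lo=-1.0, hi=1.0):
    """Coarse-to-fine search for voxels crossed by the surface of occ."""
    n_final = n0 * 2 ** steps          # voxels per side at the finest level
    cache = {}                          # integer grid point -> occupied?

    def occupied(ijk):
        # Evaluate each grid point at most once.
        if ijk not in cache:
            p = tuple(lo + (hi - lo) * c / n_final for c in ijk)
            cache[ijk] = occ(p) >= tau
        return cache[ijk]

    corners = list(itertools.product((0, 1), repeat=3))
    size = n_final // n0                # voxel edge length in finest-grid units
    voxels = [(i * size, j * size, k * size)
              for i, j, k in itertools.product(range(n0), repeat=3)]
    for level in range(steps + 1):
        # A voxel is active if its corners do not all agree.
        active = [v for v in voxels
                  if len({occupied((v[0] + a * size, v[1] + b * size, v[2] + c * size))
                          for a, b, c in corners}) == 2]
        if level == steps:
            break
        half = size // 2
        # Subdivide every active voxel into 8 children.
        voxels = [(i + a * half, j + b * half, k + c * half)
                  for (i, j, k) in active for a, b, c in corners]
        size = half
    return active, size, len(cache), (n_final + 1) ** 3

active, size, n_eval, n_dense = mise_active_voxels(toy_occupancy)
print(f"{len(active)} surface voxels, {n_eval} evaluations vs {n_dense} dense")
```

The saving comes from never refining cells that lie entirely inside or entirely outside the shape. The initial grid must be fine enough that no part of the surface falls strictly between its points.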
Onet = Occupancy Network
This paper measures the volumetric IoU to the ground truth mesh.
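One way to estimate volumetric IoU is by sampling points and counting how many fall into the intersection and the union of the two shapes. The sketch below assumes that approach and uses hypothetical sphere indicator functions in place of real meshes; the sample count, bounding box, and seed are arbitrary choices.

```python
import random

def in_sphere(center, radius):
    """Return an inside/outside indicator for a sphere (stand-in for a mesh)."""
    def indicator(p):
        return sum((a - b) ** 2 for a, b in zip(p, center)) <= radius ** 2
    return indicator

def volumetric_iou(inside_a, inside_b, n_samples=50000, lo=-1.0, hi=1.0, seed=0):
    """Monte Carlo estimate of |A ∩ B| / |A ∪ B| over the box [lo, hi]^3."""
    rng = random.Random(seed)
    inter = union = 0
    for _ in range(n_samples):
        p = (rng.uniform(lo, hi), rng.uniform(lo, hi), rng.uniform(lo, hi))
        a, b = inside_a(p), inside_b(p)
        inter += a and b
        union += a or b
    return inter / union if union else 0.0

big = in_sphere((0.0, 0.0, 0.0), 0.5)
small = in_sphere((0.0, 0.0, 0.0), 0.25)
iou_same = volumetric_iou(big, big)
iou_nested = volumetric_iou(big, small)  # analytic value is (0.25 / 0.5)**3 = 0.125
```

An IoU of 1 means the shapes coincide and 0 means they do not overlap, so the value is always bounded by [0, 1].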
Onet achieves a high mean IoU of 0.89
(while a low-resolution voxel representation is not able to represent the meshes accurately.)
Onet is able to encode all training samples with as little as 6M parameters, independently of the resolution.
(in contrast, memory requirements of a voxel representation grow cubically with resolution.)
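The cubic growth is easy to check with a few lines of arithmetic. The helper name below is illustrative, and the count is the number of stored occupancy values, not bytes.

```python
def voxel_grid_values(resolution):
    """Number of occupancy values in a dense resolution^3 voxel grid."""
    return resolution ** 3

for r in (32, 64, 128, 256):
    print(r, voxel_grid_values(r))
```

Each doubling of the resolution multiplies the storage by 8, whereas the parameter count of a network is fixed regardless of the resolution at which it is later queried.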
Occupancy networks represent details of the 3D geometry that are lost in a low-resolution voxelization.
This paper introduced occupancy networks, a new representation for 3D geometry,
which is not constrained by the discretization of the 3D space and can hence be used to represent realistic high-resolution meshes.
This paper showed the expressiveness and effectiveness of occupancy networks for both supervised and unsupervised learning through experiments.