Most existing work on generating 3D data relies on regular representations such as volumetric grids or collections of images.
However, these representations obscure the natural invariance of 3D shapes under geometric transformations, and they also suffer from a number of other issues.
This paper is the first to study the point set generation problem with deep learning.
(Generative networks for 3D geometry based on a point cloud representation.)
A conditional shape sampler, capable of predicting multiple plausible 3D point clouds from an input image.
A principled formulation and solution for the ground truth ambiguity issue in 3D reconstruction from a single image.
Building a conditional generative network for point sets is challenging, due to the unordered form of the representation and the inherent ambiguity of the ground truth.
This paper addresses three subproblems:
1) Point set generator architecture
A network with two prediction branches: one enjoys high flexibility in capturing complicated structures, and the other exploits geometric continuity.
2) Loss function for point set comparison
How to measure the distance between the prediction and the ground truth.
This paper introduces two distance metrics for point sets
-> the Chamfer distance (CD) & the Earth Mover's distance (EMD)
Both metrics are differentiable almost everywhere and can be used as the loss function.
3) Modeling the uncertainty of the ground truth
The ambiguity of the ground truth for a given input is characterized, and multiple predictions are generated in practice, either by simply wrapping the loss proposed above in a min function or by using a conditional variational autoencoder.
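The min wrapper can be sketched roughly as follows: sample several random vectors r, generate one point set per r, and keep only the smallest distance to the ground truth. This is a minimal NumPy sketch, not the paper's implementation. The names `min_of_n_loss` and `toy_generate`, the standard normal sampling of r, and the use of the Chamfer distance as the inner loss are all assumptions made for illustration.

```python
import numpy as np


def chamfer_distance(s1, s2):
    """Sum of squared nearest-neighbour distances, taken in both directions."""
    d2 = np.sum((s1[:, None, :] - s2[None, :, :]) ** 2, axis=-1)
    return float(d2.min(axis=1).sum() + d2.min(axis=0).sum())


def min_of_n_loss(generate, image, target, n=4, r_dim=8, rng=None):
    """Smallest distance to the target over n sampled random vectors r."""
    rng = np.random.default_rng() if rng is None else rng
    losses = []
    for _ in range(n):
        r = rng.standard_normal(r_dim)  # assumption: r is drawn from N(0, I)
        losses.append(chamfer_distance(generate(image, r), target))
    return min(losses)


def toy_generate(image, r):
    """Hypothetical generator: a unit square shifted along x by r[0]."""
    base = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
    return base + np.array([r[0], 0.0, 0.0])


target = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
image = None  # the toy generator ignores the image

loss_1 = min_of_n_loss(toy_generate, image, target, n=1, rng=np.random.default_rng(0))
loss_8 = min_of_n_loss(toy_generate, image, target, n=8, rng=np.random.default_rng(0))
print(loss_1, loss_8)
```

Because only the best of the n candidates is penalized, the generator is free to spread its other candidates over different plausible shapes instead of averaging them.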
The network for point set prediction has an encoder stage and a predictor stage.
1-1) Vanilla version
Encoder
-> Encoder maps the input pair of an image I and a random vector r into an embedding space.
-> Composition of convolution and ReLU layers.
-> Random vector r is subsumed so that it perturbs the prediction from the image I.
Predictor
-> Predictor outputs the shape as a matrix M (N x 3), each row containing the coordinates of one point.
-> Predictor generates the coordinates of N points through a fully connected network.
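To make the output format concrete, here is a minimal sketch of a fully connected predictor head that turns an embedding into the N x 3 matrix M. A single linear layer stands in for the fully connected network. The function name `fc_predictor`, the layer sizes, and the random weights are illustrative assumptions, not the paper's configuration.

```python
import numpy as np


def fc_predictor(embedding, weights, bias, n_points):
    """Map an embedding vector to an (n_points, 3) matrix of xyz coordinates."""
    flat = embedding @ weights + bias  # shape (3 * n_points,)
    return flat.reshape(n_points, 3)   # each row holds one point


rng = np.random.default_rng(0)
embed_dim, n_points = 16, 5            # hypothetical sizes
embedding = rng.standard_normal(embed_dim)
weights = rng.standard_normal((embed_dim, 3 * n_points))
bias = np.zeros(3 * n_points)

M = fc_predictor(embedding, weights, bias, n_points)
print(M.shape)
```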
1-2) Two prediction branch version: Predictor branch is improved to better accommodate large and smooth surfaces.
The fully connected predictor above can't make full use of natural geometric statistics, since each point is predicted independently.
This version has two parallel predictor branches:
a fully-connected branch (fc) & a deconvolution branch (deconv).
fc branch
deconv branch
Their predictions are later merged together to form the whole set of points in M.
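The merge step can be sketched as below. It assumes the fc branch emits a list of points directly, while the deconv branch emits a 3-channel map whose pixels are each read as one xyz point. The sizes used here (256 fc points plus a 32 x 24 map, 1024 points in total) are illustrative assumptions, and `merge_branches` is a hypothetical helper name.

```python
import numpy as np


def merge_branches(fc_points, deconv_map):
    """Stack fc points (N1, 3) and deconv pixels (H, W, 3) into one (N, 3) matrix."""
    deconv_points = deconv_map.reshape(-1, 3)  # one point per pixel
    return np.concatenate([fc_points, deconv_points], axis=0)


rng = np.random.default_rng(0)
fc_points = rng.standard_normal((256, 3))       # assumed fc output size
deconv_map = rng.standard_normal((32, 24, 3))   # assumed deconv output size

M = merge_branches(fc_points, deconv_map)
print(M.shape)
```

Since a point set is unordered, the row order of the concatenation does not matter to a loss such as the Chamfer distance.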
(Multiple skip links are added to boost information flow across encoder and predictor)
1-3) Hourglass version: Pursuing better performance
We still need to design a proper loss function for point set prediction, and to enable the role of r in predicting multiple candidates.
Chamfer distance
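As a rough illustration, the Chamfer distance sums, for every point in one set, the squared distance to its nearest neighbour in the other set, and does so in both directions. The sketch below is a plain NumPy version under that definition. It builds the full pairwise distance matrix, so it is only meant for small point sets.

```python
import numpy as np


def chamfer_distance(s1, s2):
    """Chamfer distance between point sets s1 (n, 3) and s2 (m, 3)."""
    # Pairwise squared Euclidean distances, shape (n, m).
    d2 = np.sum((s1[:, None, :] - s2[None, :, :]) ** 2, axis=-1)
    # Nearest neighbour in s2 for each point of s1, and the reverse.
    return float(d2.min(axis=1).sum() + d2.min(axis=0).sum())


s1 = np.array([[0.0, 0.0, 0.0]])
s2 = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
print(chamfer_distance(s1, s2))  # 1 (s1 to s2) + 1 + 4 (s2 to s1) = 6.0
```

The two sets do not need to have the same number of points, because each point only looks for its own nearest neighbour.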
Earth Mover's distance
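For two point sets of equal size, the Earth Mover's distance can be read as the cost of the cheapest one-to-one matching between them, where each matched pair costs its Euclidean distance. The sketch below finds that matching by brute force over all permutations. This is an illustrative assumption to show the definition on tiny sets, not a practical or differentiable implementation, and `emd_bruteforce` is a hypothetical name.

```python
from itertools import permutations

import numpy as np


def emd_bruteforce(s1, s2):
    """Exact EMD between equal-size point sets by trying every bijection."""
    assert len(s1) == len(s2), "EMD here assumes equal-size sets"
    n = len(s1)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(s1[:, None, :] - s2[None, :, :], axis=-1)
    # Cost of a bijection p is the sum of distances from point i to point p[i].
    return float(min(sum(dist[i, p[i]] for i in range(n))
                     for p in permutations(range(n))))


s1 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
s2 = np.array([[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
print(emd_bruteforce(s1, s2))  # each point moves up by 1, total 2.0
```

The number of permutations grows factorially with the set size, so anything beyond a handful of points needs an approximate or assignment-based solver.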
Shape space
Effect of combining deconv and fc branches for reconstruction