Date: 2017
Venue: CVPR (conference)
Point clouds and meshes are not in a regular format; this forces a transformation to 3D voxel grids or collections of images
This data-representation transformation renders the resulting data unnecessarily voluminous
PointNet instead consumes raw point clouds directly
As a point cloud is just a set of points, the basic architecture is simple: in the initial stages, each point is processed identically and independently
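A minimal sketch of this idea (not the paper's exact architecture; the weights and sizes here are hypothetical): a "shared MLP" is the same weight matrix applied to every point independently, so one batched matrix multiply processes all points identically.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))   # N points with (x, y, z) coordinates
W = rng.normal(size=(3, 64))          # hypothetical shared weights, 3 -> 64
b = np.zeros(64)

# One matrix multiply applies the identical transform to all points at once;
# each row depends only on its own point.
per_point_features = np.maximum(points @ W + b, 0.0)  # ReLU, shape (1024, 64)
```

Processing any single point in isolation gives the same row as the batched computation, which is what "identical and independent" means here.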
PointNet is trained to perform 3D shape classification, shape part segmentation, and scene semantic parsing tasks
Hand-crafted point features encode certain statistical properties of points and are designed to be invariant to certain transformations; they are typically classified as intrinsic or extrinsic, and can also be categorized as local or global features
One recent work used a read-process-write network with an attention mechanism to consume unordered input sets
For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud
The input is a subset of points from a Euclidean space, with three properties:
unordered; interaction among nearby points (neighboring points form meaningful subsets); invariance to certain transformations (the learned representation of the point set should not change under, e.g., rotation and translation)
Three key modules: a max pooling layer as a symmetric function to aggregate information from all points; a structure combining local and global information; and two joint alignment networks
Sorting does not fully resolve the ordering issue
An MLP applied directly to a sorted point set still performs poorly
Training an RNN on randomly permuted sequences is an attempt to make it invariant to input order
However, for an RNN, order does matter and cannot be totally omitted, so it does not become truly order-invariant
Approximate a general function defined on a point set by applying a symmetric function on transformed elements of the set
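In symbols (a paraphrase of the paper's formulation):

```latex
f(\{x_1, \dots, x_n\}) \;\approx\; g\bigl(h(x_1), \dots, h(x_n)\bigr)
```

where $h$ is a per-point transformation (the shared MLP) and $g$ is a symmetric function of its arguments; PointNet realizes $g$ as a composition of max pooling with a further function of the pooled vector.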
Due to the simplicity of the module, theoretical analysis was possible
Point segmentation requires a combination of local and global knowledge
After computing the global point cloud feature vector, it is fed back by concatenating it with each per-point feature
New per-point features are then extracted from these combined features, now aware of both local and global information
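A sketch of that combination step, assuming the paper's sizes (64-d local features, 1024-d global feature; both arrays here are random placeholders): the global vector is tiled and concatenated onto every per-point row.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1024
local_feats = rng.normal(size=(n, 64))   # placeholder per-point features
global_feat = rng.normal(size=(1024,))   # placeholder global feature vector

# Broadcast the single global vector to every point, then concatenate,
# giving each point a (64 + 1024) = 1088-d combined feature.
combined = np.concatenate([local_feats, np.tile(global_feat, (n, 1))], axis=1)
```

Every point then carries the same global context alongside its own local descriptor, which is what segmentation needs.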
A mini-network predicts an affine transformation matrix, and this transformation is applied directly to the coordinates of the input points
The mini-network itself resembles the big network and is composed of the same basic modules: point-independent feature extraction, max pooling, and fully connected layers
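A toy sketch of that pipeline (the layer sizes and the near-identity initialization are assumptions for illustration, not the paper's exact configuration):

```python
import numpy as np

def tnet_apply(points, W1, W2):
    # Point-independent feature extraction (shared weights across points)
    feats = np.maximum(points @ W1, 0.0)
    # Symmetric aggregation via max pooling over all points
    pooled = feats.max(axis=0)
    # Fully connected layer predicting a flattened 3x3 matrix,
    # offset by the identity so the initial transform is near-identity
    transform = (pooled @ W2).reshape(3, 3) + np.eye(3)
    # Apply the predicted transform directly to the input coordinates
    return points @ transform

rng = np.random.default_rng(3)
pts = rng.normal(size=(256, 3))
out = tnet_apply(pts, rng.normal(size=(3, 32)), rng.normal(size=(32, 9)) * 0.01)
```

Because the transform is predicted from a max-pooled feature, it is itself invariant to the input point ordering.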
The transformation matrix in feature space has a much higher dimension than the spatial transform matrix, making optimization harder
Therefore a regularization term is added to the softmax training loss, constraining the feature transformation matrix to be close to an orthogonal matrix
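The paper's regularizer penalizes the squared Frobenius norm of $I - AA^T$, which is zero exactly when $A$ is orthogonal. A minimal implementation:

```python
import numpy as np

def feature_transform_regularizer(A):
    """||I - A A^T||_F^2: zero iff A is orthogonal, growing as A drifts away."""
    k = A.shape[0]
    diff = np.eye(k) - A @ A.T
    return float(np.sum(diff ** 2))

# An orthogonal matrix (e.g. the identity) incurs no penalty.
assert np.isclose(feature_transform_regularizer(np.eye(64)), 0.0)
```

In training, this value (scaled by a weight) is added to the classification loss.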
Theoretical analysis shows the network's ability to approximate continuous set functions
Given enough neurons at the max pooling layer:
Theorem 1: suppose f is a continuous set function with respect to the Hausdorff distance; then PointNet can approximate f arbitrarily closely
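A paraphrase of that universal-approximation statement, where f is a continuous set function with respect to the Hausdorff distance:

```latex
\forall\, \epsilon > 0,\ \exists\ \text{continuous } h,\ \gamma:\quad
\Bigl|\, f(S) - \gamma\bigl(\max_{x_i \in S}\{ h(x_i) \}\bigr) \Bigr| < \epsilon
\quad \text{for any input set } S
```

with the max taken element-wise over the per-point feature vectors, i.e. exactly the shared-MLP-plus-max-pooling structure the network uses.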
In the worst case the network can learn to convert a point cloud into a volumetric representation by partitioning the space into equal-sized voxels
The expressiveness of the network is strongly affected by the dimension of the max pooling layer
Define the sub-network u of f which maps a point set to a K-dimensional vector (per-point features followed by max pooling)
Theorem 2 (stability): the output f(S) is determined by a critical point set C_S with at most K points, and f(T) = f(S) for any T with C_S ⊆ T ⊆ N_S, where N_S is an upper-bound shape
Hence the network is robust to missing points (down to C_S) and to extra noise points (up to N_S)
Robustness is gained in analogy to the sparsity principle in machine learning models
Intuitively, the network learns to summarize a shape by a sparse set of key points