[WIP] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Estelle Yoon · March 18, 2025


PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Date: 2017
Venue: CVPR

1. Introduction

Point clouds and meshes are not in a regular format, so most prior approaches transform such data into regular 3D voxel grids or collections of images before feeding it to a network

This data representation transformation renders the resulting data unnecessarily voluminous

PointNet instead consumes raw point clouds directly

Since a point cloud is just a set of points, the basic architecture is simple: in the initial stages, each point is processed identically and independently

PointNet is trained to perform 3D shape classification, shape part segmentation, and scene semantic parsing

2. Related Work

Point Cloud Feature

Hand-crafted point features typically encode certain statistical properties of points and are designed to be invariant to certain transformations; they are usually classified as intrinsic or extrinsic, and can also be categorized as local and global features

Deep Learning on 3D Data

Deep Learning on Unordered Sets

One recent work uses a read-process-write network with an attention mechanism to consume unordered input sets

3. Problem Statement

For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud

4. Deep Learning on Point Sets

4.1. Properties of Point Sets in $\mathbb{R}^n$

The input is a subset of points from a Euclidean space

  • Unordered: the network must be invariant to permutations of the input points
  • Interaction among points: neighboring points form a meaningful subset, so the model should capture local structures
  • Invariance under transformations: the learned representation of the point set should be invariant to certain transformations such as rotation and translation

4.2. PointNet Architecture

Three key modules:
  • a max pooling layer as a symmetric function to aggregate information from all the points
  • a local and global information combination structure
  • two joint alignment networks that align both the input points and the point features

Symmetry Function for Unordered Input

  • Sort input into a canonical order

Sorting does not fully resolve the ordering issue: in high-dimensional space there is no ordering that is stable with respect to point perturbations

Applying an MLP directly on the sorted point set still performs poorly, though slightly better than on an unsorted input

  • Treat the input as a sequence and train an RNN

The hope is that by training on randomly permuted sequences, the RNN becomes invariant to the input order

However, for an RNN order does matter and cannot be totally omitted, and RNNs are hard to scale to thousands of input elements

  • Use a simple symmetric function to aggregate the information from each point

The idea is to approximate a general function defined on a point set by applying a symmetric function on transformed elements in the set: $f(\{x_1, \dots, x_n\}) \approx g(h(x_1), \dots, h(x_n))$, where $h$ is approximated by a shared MLP and $g$ by a max pooling function composed with another continuous function

Due to the simplicity of this module, theoretical analysis is possible; a minimal sketch of the idea follows
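Below is a minimal PyTorch sketch of this symmetric aggregation. The class name and layer sizes are illustrative assumptions, not the paper's exact configuration; the point is that permuting the input points leaves the output unchanged.

```python
import torch
import torch.nn as nn

class SymmetricPointEncoder(nn.Module):
    def __init__(self, in_dim=3, feat_dim=1024):
        super().__init__()
        # h: shared MLP applied identically and independently to every point
        self.h = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, points):          # points: (batch, n_points, in_dim)
        per_point = self.h(points)      # (batch, n_points, feat_dim)
        # g: max pooling over the point dimension -> order-invariant global feature
        return per_point.max(dim=1).values

# Permuting the points does not change the global feature
pts = torch.rand(2, 1024, 3)
enc = SymmetricPointEncoder()
perm = torch.randperm(1024)
assert torch.allclose(enc(pts), enc(pts[:, perm, :]))
```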

Local and Global Information Aggregation

Point segmentation requires a combination of local and global knowledge

After computing the global point cloud feature vector, it is fed back by concatenating it with each of the per-point features

New per-point features are then extracted based on the combined point features, so each point is aware of both local and global information
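A small sketch of this concatenation step (the helper name and shapes are my assumptions for illustration):

```python
import torch

def combine_local_global(per_point_feat, global_feat):
    """Concatenate the global feature to every per-point feature."""
    # per_point_feat: (batch, n_points, d_local); global_feat: (batch, d_global)
    n_points = per_point_feat.shape[1]
    expanded = global_feat.unsqueeze(1).expand(-1, n_points, -1)  # (batch, n_points, d_global)
    return torch.cat([per_point_feat, expanded], dim=-1)          # (batch, n_points, d_local + d_global)
```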

Joint Alignment Network

A mini-network (T-Net) predicts an affine transformation matrix, and this transformation is directly applied to the coordinates of the input points

The mini-network itself resembles the big network and is composed of basic modules: point-independent feature extraction, max pooling, and fully connected layers
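A rough PyTorch sketch of such a mini-network (layer widths and the identity initialization follow common PointNet implementations, but treat the details here as assumptions):

```python
import torch
import torch.nn as nn

class TNet(nn.Module):
    """Predicts a k x k transform from the point set and applies it to the input."""
    def __init__(self, k=3):
        super().__init__()
        self.k = k
        # point-independent feature extraction (shared MLP)
        self.mlp = nn.Sequential(
            nn.Linear(k, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 1024), nn.ReLU(),
        )
        # fully connected layers applied after max pooling
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, k * k),
        )

    def forward(self, points):                        # points: (batch, n_points, k)
        feat = self.mlp(points).max(dim=1).values     # symmetric max pooling
        mat = self.fc(feat).view(-1, self.k, self.k)
        # start near the identity so training begins from "no transform"
        mat = mat + torch.eye(self.k, device=points.device)
        return torch.bmm(points, mat)                 # transformed coordinates
```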

The transformation matrix in the feature space has a much higher dimension than the spatial transformation matrix, which greatly increases the difficulty of optimization

Therefore a regularization term is added to the softmax training loss, constraining the feature transformation matrix $A$ to be close to an orthogonal matrix: $L_{reg} = \|I - AA^T\|_F^2$
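A minimal sketch of this regularizer (the function name is my own; the batch-mean reduction is one common choice, not necessarily the paper's):

```python
import torch

def feature_transform_regularizer(A):
    """L_reg = ||I - A A^T||_F^2 for a batch of k x k feature transform matrices."""
    k = A.shape[-1]
    identity = torch.eye(k, device=A.device).unsqueeze(0)   # (1, k, k)
    diff = identity - torch.bmm(A, A.transpose(1, 2))       # (batch, k, k)
    return (diff ** 2).sum(dim=(1, 2)).mean()               # squared Frobenius norm, batch mean
```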

4.3. Theoretical Analysis

Universal approximation

The network has the ability to universally approximate continuous set functions, given enough neurons at the max pooling layer

Theorem 1.

Suppose $f : \mathcal{X} \rightarrow \mathbb{R}$ is a continuous set function with respect to the Hausdorff distance $d_H(\cdot, \cdot)$; then for any $\epsilon > 0$ there exist a continuous function $h$ and a symmetric function $g(x_1, \dots, x_n) = \gamma \circ \mathrm{MAX}$ such that $\left| f(S) - \gamma\big(\mathrm{MAX}_{x_i \in S}\{h(x_i)\}\big) \right| < \epsilon$ for any $S \in \mathcal{X}$

In the worst case, the network can learn to convert a point cloud into a volumetric representation by partitioning the space into equal-sized voxels

Bottleneck dimension and stability

The expressiveness of the network is strongly affected by the dimension of the max pooling layer, i.e., $K$

Define $u = \mathrm{MAX}_{x_i \in S}\{h(x_i)\}$ to be the sub-network of $f$ which maps a point set in $[0, 1]^m$ to a $K$-dimensional vector

Theorem 2.

For every point set $S$ there exist a critical point set $\mathcal{C}_S$ and an upper-bound shape $\mathcal{N}_S$ such that $f(T) = f(S)$ for any $T$ with $\mathcal{C}_S \subseteq T \subseteq \mathcal{N}_S$, and $|\mathcal{C}_S| \leq K$

This means $f(S)$ is unchanged as long as all points in $\mathcal{C}_S$ are preserved, and extra noise points are tolerated up to $\mathcal{N}_S$

Robustness is gained in analogy to the sparsity principle in machine learning models

Intuitively, the network learns to summarize a shape by a sparse set of key points
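As an illustration of the critical point set idea, here is a small sketch (names and shapes are my assumptions) that recovers the points selected by the max pooling:

```python
import torch

def critical_point_indices(per_point_feat):
    """Indices of points that achieve the per-dimension maxima (so |C_S| <= K)."""
    # per_point_feat: (n_points, K) features h(x_i) before max pooling
    return torch.unique(per_point_feat.argmax(dim=0))
```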
