PointNet Paper Review

박민서 · September 5, 2023

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

Point clouds are irregular and unordered. These properties make it challenging to feed point cloud data directly into a deep network. Prior work converted the data into regular 3D voxel grids or collections of images, but this makes the data unnecessarily voluminous and introduces other issues.
This paper proposes 'PointNet', a network that takes raw point clouds as input while respecting the permutation invariance of the points.
'PointNet' is a unified architecture that handles tasks ranging from object classification and part segmentation to scene semantic parsing.
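
To make the "unnecessarily voluminous" point about voxel grids concrete, here is a small NumPy sketch. The numbers (1024 points, a 32x32x32 grid, uniformly random coordinates) are illustrative assumptions and do not come from the paper; real scans lie on surfaces, so their occupancy pattern would look different.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative cloud: 1024 points in [-1, 1)^3 (assumed sizes, not from the paper)
points = rng.uniform(-1.0, 1.0, size=(1024, 3))

resolution = 32  # assumed grid resolution
# Map each coordinate from [-1, 1) to an integer cell index in [0, resolution)
idx = np.floor((points + 1.0) / 2.0 * resolution).astype(int)
idx = np.clip(idx, 0, resolution - 1)

# Binary occupancy grid: True where at least one point falls in the cell
grid = np.zeros((resolution,) * 3, dtype=bool)
grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True

raw_values = points.size            # number of floats stored for the raw cloud
voxel_cells = grid.size             # number of cells the grid has to store
occupancy = grid.sum() / grid.size  # fraction of cells that contain any point

print(raw_values, voxel_cells, f"{occupancy:.2%}")
```

Under these assumptions the grid stores about ten times more values than the raw cloud, and almost all of its cells are empty.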

It is a unified model in that the input is always a point cloud and the output is either a class label for the whole input or a segmentation label for each input point. (The type of data being handled does not change.)
The basic structure is simple: in the early stages, every point is processed identically and independently.
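It uses max pooling, a single symmetric function. In effect, the network learns criteria for selecting interesting or informative points and encodes the reason for their selection.
It uses FC layers. They aggregate the values produced by max pooling into a global descriptor, which is then used either to classify the entire shape or to predict per-point labels.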
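
The three ingredients above (a shared per-point network, max pooling, and FC layers on top) can be sketched in a few lines of NumPy. This is a minimal illustration under my own assumptions, not the paper's implementation: the layer sizes, the random weights, and the names `shared_mlp`, `global_descriptor`, and `W_cls` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_mlp(points, weights, biases):
    """Apply the same small MLP to every point; rows never interact."""
    h = points
    for W, b in zip(weights, biases):
        h = np.maximum(h @ W + b, 0.0)  # linear layer + ReLU, per point
    return h  # shape (N, C)

def global_descriptor(points, weights, biases):
    """Max-pool the per-point features over the point axis (a symmetric function)."""
    return shared_mlp(points, weights, biases).max(axis=0)  # shape (C,)

# Hypothetical layer sizes, much smaller than anything in the paper
dims = [3, 16, 32]
weights = [rng.normal(size=(i, o)) for i, o in zip(dims[:-1], dims[1:])]
biases = [rng.normal(size=o) for o in dims[1:]]
W_cls = rng.normal(size=(32, 4))  # FC head for 4 made-up classes

points = rng.normal(size=(128, 3))
shuffled = points[rng.permutation(len(points))]  # same set, different order

g1 = global_descriptor(points, weights, biases)
g2 = global_descriptor(shuffled, weights, biases)
print(np.allclose(g1, g2))  # the descriptor does not depend on point order

class_scores = g1 @ W_cls  # classification: FC layer on the global descriptor

# Segmentation-style input: each point's own feature concatenated with the global one
per_point = shared_mlp(points, weights, biases)
seg_input = np.concatenate([per_point, np.tile(g1, (len(points), 1))], axis=1)
```

Because the MLP treats each row separately and the maximum ignores row order, shuffling the points leaves the global descriptor unchanged, which is the permutation invariance the paper is after.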
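A data-dependent STN (spatial transformer network) is added in front of PointNet to canonicalize the input, which improves performance a bit further.
Written as an equation, the model can approximate any continuous set function, which the paper offers as the reason it is well suited to point clouds.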
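
For reference, the form of the approximation as I understand it from the paper is roughly the following, where h is the shared per-point network, MAX is the element-wise maximum over points, and γ is the network applied to the pooled vector:

```latex
f(\{x_1, \dots, x_n\}) \approx g\big(h(x_1), \dots, h(x_n)\big),
\qquad g = \gamma \circ \mathrm{MAX}

% Universal approximation statement: for a continuous set function f and any
% \epsilon > 0, there exist a continuous h and a continuous \gamma such that,
% for every point set S,
\left| f(S) - \gamma\Big( \underset{x_i \in S}{\mathrm{MAX}} \{ h(x_i) \} \Big) \right| < \epsilon
```

Since g is symmetric in its arguments, the value of f does not change when the points are reordered.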
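The network learns to summarize the input point cloud with a sparse set of key points (the points that look like the object's skeleton when visualized). (This seems to be a consequence of max pooling.) The critical points and upper-bound shape points that appear later are related to this.
PointNet is also robust to perturbations (small changes) of the input points, such as outliers or missing data.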
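
The link between max pooling, critical points, and robustness can be checked directly. The sketch below is a toy setup under my own assumptions (a single random linear layer with ReLU stands in for the per-point network, and `point_features` is a hypothetical helper): it finds the points that win the maximum in at least one channel, then shows that deleting all other points, or inserting points whose features stay below the current maximum, leaves the global feature unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical one-layer per-point feature map (3 -> 32), random weights
W = rng.normal(size=(3, 32))
b = rng.normal(size=32)

def point_features(points):
    return np.maximum(points @ W + b, 0.0)  # shape (N, 32)

points = rng.normal(size=(256, 3))
feats = point_features(points)
global_feat = feats.max(axis=0)

# Critical points: those that attain the max in at least one feature channel
critical_idx = np.unique(feats.argmax(axis=0))
critical_pts = points[critical_idx]

# Deletion: keep only the critical points and pool again
feat_from_critical = point_features(critical_pts).max(axis=0)

# Insertion: add candidate points whose features never exceed the current max
candidates = rng.normal(size=(2000, 3)) * 0.5
keep = (point_features(candidates) <= global_feat).all(axis=1)
augmented = np.concatenate([points, candidates[keep]], axis=0)
feat_from_augmented = point_features(augmented).max(axis=0)

print(len(critical_idx), "critical points out of", len(points))
print(np.allclose(global_feat, feat_from_critical))
print(np.allclose(global_feat, feat_from_augmented))
```

The number of critical points can never exceed the number of feature channels (32 here), which is why the summary is sparse. The inserted points play a role similar to the upper-bound shape in the paper, though this toy check is not the paper's experiment.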