All commands below are run inside a Docker container unless noted otherwise.
pip install "opencv-python-headless<4.3"
cd /workspace/jiwon
git clone https://github.com/tianweiy/CenterPoint.git
cd CenterPoint
pip install -r requirements.txt
# add CenterPoint to PYTHONPATH by adding the following line to ~/.bashrc
export PYTHONPATH="${PYTHONPATH}:/workspace/jiwon/CenterPoint"
git clone https://github.com/tianweiy/nuscenes-devkit
# add the following line to ~/.bashrc and reactivate bash
export PYTHONPATH="${PYTHONPATH}:/workspace/jiwon/CenterPoint/nuscenes-devkit/python-sdk"
# set the cuda path
nvcc --version
export PATH=/usr/local/cuda-11.5/bin:$PATH
export CUDA_PATH=/usr/local/cuda-11.5
export CUDA_HOME=/usr/local/cuda-11.5
export LD_LIBRARY_PATH=/usr/local/cuda-11.5/lib64:$LD_LIBRARY_PATH
# Rotated NMS
cd det3d/ops/iou3d_nms
python setup.py build_ext --inplace
# on the host, not inside the Docker container
sudo apt-get install libboost-all-dev
# in Docker container
cd /workspace/jiwon/CenterPoint
git clone https://github.com/traveller59/spconv.git --recursive
### issue 1 ###
cd spconv
git checkout 7342772
### issue 2 ###
### issue 3 ###
### issue 4 ###
cd /workspace/jiwon/CenterPoint/spconv && python setup.py bdist_wheel
cd ./dist && pip install *
Solve issue 1 (safe.directory)
git config --global --add safe.directory /home/ubuntu/jiwon/CenterPoint/spconv
Solve issue 2 (CMakeLists.txt)
cd third_party && git clone https://github.com/pybind/pybind11.git
cd pybind11 && git checkout 085a29436a8c472caaaf7157aa644b571079bcaa
Solve issue 3 (Half operands)
In /workspace/CenterPoint/spconv/setup.py, add "-D__CUDA_NO_HALF_OPERATORS__" to the "-DCMAKE_CUDA_FLAGS" options.
Solve issue 4 (torch::jit)
In /workspace/CenterPoint/spconv/src/spconv/all.cc, replace "torch::jit" with "torch".
Because of storage limits on AWS, I trained the model on the mini-dataset rather than the full dataset.
Data creation should be done in a GPU environment.
# Create a symlink to the dataset root
mkdir /workspace/jiwon/CenterPoint/data
cd /workspace/jiwon/CenterPoint/data
ln -s /workspace/DATA_ROOT
mv DATA_ROOT nuScenes
# Create data
cd /workspace/jiwon/CenterPoint
python tools/create_data.py nuscenes_data_prep --root_path=/workspace/DATA_ROOT --version="v1.0-trainval" --nsweeps=10
In the end, the data and info files should be organized as follows
python -m torch.distributed.launch --nproc_per_node=1 ./tools/train.py CONFIG_PATH
CONFIG_PATH
: e.g. configs/nusc/pp/nusc_centerpoint_pp_02voxel_two_pfn_10sweep.py
Checkpoints and logs are saved to work_dirs/CONFIG_NAME.
python -m torch.distributed.launch --nproc_per_node=1 ./tools/dist_test.py CONFIG_PATH --work_dir work_dirs/CONFIG_NAME --checkpoint work_dirs/CONFIG_NAME/latest.pth
CONFIG_NAME
: e.g. nusc_centerpoint_pp_02voxel_two_pfn_10sweep
python -m torch.distributed.launch --nproc_per_node=1 ./tools/dist_test.py configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z.py --work_dir work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z --checkpoint work_dirs/nusc_centerpoint_voxelnet_0075voxel_fix_bn_z/pretrained_epoch_20.pth
- mean Average Precision (mAP): We use the well-known Average Precision metric, but define a match by considering the 2D center distance on the ground plane rather than intersection over union based affinities. Specifically, we match predictions with the ground truth objects that have the smallest center-distance up to a certain threshold. For a given match threshold we calculate average precision (AP) by integrating the recall vs precision curve for recalls and precisions > 0.1. We finally average over match thresholds of {0.5, 1, 2, 4} meters and compute the mean across classes.
- True Positive metrics
- Average Translation Error (ATE): Euclidean center distance in 2D in meters.
- Average Scale Error (ASE): Calculated as 1 − IOU after aligning centers and orientation.
- Average Orientation Error (AOE): Smallest yaw angle difference between prediction and ground-truth in radians. Orientation error is evaluated at 360 degrees for all classes except barriers, where it is only evaluated at 180 degrees. Orientation errors for cones are ignored.
- Average Velocity Error (AVE): Absolute velocity error in m/s. Velocity errors for barriers and cones are ignored.
- Average Attribute Error (AAE): Calculated as 1 − acc, where acc is the attribute classification accuracy. Attribute errors for barriers and cones are ignored.
- nuScenes detection score (NDS): We consolidate the above metrics by computing a weighted sum of mAP, mATE, mASE, mAOE, mAVE and mAAE. As a first step we convert the TP errors to TP scores as TP_score = max(1 − TP_error, 0). We then assign a weight of 5 to mAP and 1 to each of the five TP scores and calculate the normalized sum (dividing by the total weight of 10).
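The center-distance matching that underlies mAP can be sketched as follows. This is a minimal illustration with a hypothetical helper function, not the actual nuscenes-devkit implementation:

```python
import numpy as np

def match_by_center_distance(pred_centers, pred_scores, gt_centers, threshold):
    """Greedily match predictions (highest score first) to the nearest
    still-unmatched ground-truth box by 2D center distance on the ground plane."""
    order = np.argsort(-np.asarray(pred_scores))
    unmatched = set(range(len(gt_centers)))
    matches = []
    for i in order:
        if not unmatched:
            break
        # distance from this prediction to every still-unmatched GT center
        dists = {j: float(np.hypot(*(np.asarray(pred_centers[i]) - np.asarray(gt_centers[j]))))
                 for j in unmatched}
        j = min(dists, key=dists.get)
        if dists[j] <= threshold:  # a match only counts within the threshold
            matches.append((int(i), j))
            unmatched.discard(j)
    return matches
```

AP at each threshold then follows from the precision-recall curve over these matches, averaged over the {0.5, 1, 2, 4} m thresholds.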
Mean
metric | value |
---|---|
mAP (mean Average Precision) | 0.1269 |
mATE (mean Average Translation Error) | 0.6030 |
mASE (mean Average Scale Error) | 0.5002 |
mAOE (mean Average Orientation Error) | 1.0882 |
mAVE (mean Average Velocity Error) | 1.3837 |
mAAE (mean Average Attribute Error) | 0.5397 |
NDS (NuScenes Detection Score) | 0.1991 |
Eval time | 5.1s |
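As a sanity check, the NDS in the table above can be reproduced from the other rows with the weighted-sum formula (this is a hand computation, not devkit code):

```python
# Per-metric means from the table above
mAP = 0.1269
tp_errors = [0.6030, 0.5002, 1.0882, 1.3837, 0.5397]  # mATE, mASE, mAOE, mAVE, mAAE
# Each TP error is first converted to a score in [0, 1]: max(1 - error, 0)
tp_scores = [max(1.0 - e, 0.0) for e in tp_errors]
# mAP is weighted 5, each TP score 1; normalize by the total weight of 10
nds = (5 * mAP + sum(tp_scores)) / 10
print(nds)  # ≈ 0.1991, matching the reported NDS up to rounding
```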
Per-class
Object class | AP | ATE | ASE | AOE | AVE | AAE |
---|---|---|---|---|---|---|
car | 0.503 | 0.310 | 0.202 | 0.998 | 0.640 | 0.358 |
truck | 0.033 | 0.492 | 0.288 | 0.831 | 0.824 | 0.410 |
bus | 0.096 | 0.736 | 0.240 | 0.674 | 3.614 | 0.489 |
trailer | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
construction_vehicle | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
pedestrian | 0.578 | 0.306 | 0.304 | 1.490 | 1.110 | 0.361 |
motorcycle | 0.029 | 0.404 | 0.308 | 1.741 | 1.139 | 0.605 |
bicycle | 0.000 | 0.503 | 0.453 | 1.542 | 1.742 | 0.095 |
traffic_cone | 0.000 | 0.352 | 0.641 | nan | nan | nan |
barrier | 0.027 | 0.928 | 0.565 | 0.519 | nan | nan |
Per distance threshold
Object class | Nusc dist AP 0.5 | Nusc dist AP 1.0 | Nusc dist AP 2.0 | Nusc dist AP 4.0 | Mean AP |
---|---|---|---|---|---|
car | 34.57 | 48.84 | 56.65 | 61.33 | 0.5035 |
truck | 0.50 | 2.11 | 5.21 | 5.45 | 0.0332 |
bus | 0.27 | 7.59 | 14.09 | 16.64 | 0.0965 |
trailer | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
construction_vehicle | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
pedestrian | 50.82 | 56.91 | 59.31 | 64.35 | 0.5785 |
motorcycle | 1.61 | 2.72 | 2.90 | 4.46 | 0.0292 |
bicycle | 0.00 | 0.04 | 0.04 | 0.04 | 0.0003 |
traffic_cone | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
barrier | 0.00 | 0.27 | 2.70 | 8.03 | 0.0275 |
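The "Mean AP" column in the table above is just the average of the four per-threshold APs (which are given as percentages) converted to a fraction, e.g. for the car row:

```python
# Per-threshold APs for the car class (percent), from the table above
car_aps = [34.57, 48.84, 56.65, 61.33]
mean_ap = sum(car_aps) / len(car_aps) / 100  # average, then percent -> fraction
print(mean_ap)  # ≈ 0.5035, matching the Mean AP column
```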
As the distance threshold increases, i.e. as the criterion for counting a detection as correct loosens, AP (unsurprisingly) increases.
(Also unsurprisingly,) the pretrained model trained on the full dataset far outperforms my model trained on the mini-dataset.
📙 References