[AMD DL Study] Weeks 13-14: Vitis AI tutorials (3)

Loloh_the_great · February 5, 2024

CNN FPGA Vitis-AI

Reference : https://github.com/Xilinx/Vitis-AI-Tutorials/tree/3.5/Tutorials/RESNET18/

Contents

Review
Vitis AI tutorials : ImageNet Dataset
Summary

1. Review

In the previous post, we used Vitis AI to build the ResNet18 CNN on the CIFAR10 dataset and ran it on the actual board. This time the goal is to use the ImageNet dataset, run inference on the board, and measure the accuracy.

2. Vitis AI tutorials : ImageNet Dataset

We will build convolutional neural networks on the ImageNet dataset. Because ImageNet is more complex than CIFAR10, this time we use the ResNet50 network and run it alongside ResNet18.

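ImageNet : a large dataset of labeled images. The ILSVRC2012 subset used in this tutorial has 1,000 classes, about 1.28 million training images, and 50,000 validation images.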

All the commands are collected in run_all.sh and can be run with source run_all.sh main_imagenet, but before running it you must first create a file named val_dataset.zip.

val_dataset.zip : an archive of 500 images (the last 500 of the original 50,000 validation images), resized so that the image size is 256.

2.1 Prepare the Test Images

2.1.1 Download the ImageNet Validation Dataset

No retraining is needed in this example, so download the file ILSVRC2012_img_val.tar and generate the required files from it. Download it from the link below.

https://www.image-net.org/download.php

You need to create an account to download. Look for the file of about 6.4 GB and download it.

Next, get the val.txt file.

If val.txt and words.txt are missing, the quantization step can stop abruptly, so keep track of where these files are located.

mkdir tmp
cd tmp
wget -c http://dl.caffe.berkeleyvision.org/caffe_ilsvrc12.tar.gz
tar -xvf caffe_ilsvrc12.tar.gz
mv val.txt ..
cd ..
rm -r tmp
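As a quick sanity check on val.txt, the sketch below parses it into a dictionary. It assumes each line holds an image file name followed by an integer class index, separated by whitespace; the helper names and the sample lines are illustrative and are not taken from the tutorial scripts.

```python
from pathlib import Path
from typing import Dict, Iterable


def parse_val_labels(lines: Iterable[str]) -> Dict[str, int]:
    """Map each image file name to its integer class index."""
    labels: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # split on the last run of whitespace: "<file name> <class index>"
        name, index = line.rsplit(maxsplit=1)
        labels[name] = int(index)
    return labels


def load_val_labels(path: str = "val.txt") -> Dict[str, int]:
    """Read the label file from disk; the default path is an assumption."""
    with Path(path).open(encoding="utf-8") as f:
        return parse_val_labels(f)


# Illustrative lines in the assumed layout
sample = [
    "ILSVRC2012_val_00000001.JPEG 65",
    "ILSVRC2012_val_00000002.JPEG 970",
]
print(parse_val_labels(sample))
```

Calling load_val_labels() in the directory that holds val.txt should return one entry per validation image if the file is complete.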

2.1.3 Generate the Test Images Archive

Once ILSVRC2012_img_val.tar has finished downloading, create the val_dataset.zip file. The commands are as follows.

# enter in the docker image
cd ${WRK_DIR} # you are now in Vitis_AI subfolder
./docker_run.sh xilinx/vitis-ai-tensorflow2-gpu
conda activate vitis-ai-tensorflow2
cd /workspace/tutorials/

# starting directory
cd /workspace/tutorials/RESNET18/files/
cd modelzoo/ImageNet/ # your current directory
# place the ILSVRC2012_img_val.tar file in this directory

# make a subdirectory
mkdir val_dataset
mv ILSVRC2012_img_val.tar ./val_dataset/
# extract the tar archive
cd val_dataset
tar -xvf ILSVRC2012_img_val.tar > /dev/null
# move back the archive one level above
mv ILSVRC2012_img_val.tar ../
cd ..
# check all the 50000 images are in val_dataset folder
ls -l ./val_dataset | wc

You should see output like the following.

ls -l ./val_dataset | wc
50001  450002 4050014
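The first number is 50001 rather than 50000 because ls -l prints a leading "total" line before the file entries. If you prefer to count the files directly, a minimal Python sketch is shown here; count_files is a hypothetical helper, not part of the tutorial.

```python
import os
import tempfile


def count_files(directory: str) -> int:
    """Count the regular files directly inside `directory`."""
    return sum(1 for entry in os.scandir(directory) if entry.is_file())


# Self-check on a temporary directory holding three dummy files
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        open(os.path.join(tmp, f"img_{i}.JPEG"), "w").close()
    print(count_files(tmp))  # prints 3
```

Running count_files("./val_dataset") after the extraction should report 50000 if every image was unpacked.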

Now run imagenet_val_dataset.py.

python3 imagenet_val_dataset.py

The following message is printed.

val_dataset.zip archive is ready with  500 images
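The internals of imagenet_val_dataset.py are not shown in this post. As a rough illustration of the selection and archiving step only, the sketch below picks the last 500 images by file name and zips them. The function names and the archive layout are assumptions, and the resizing step is left out because it needs an imaging library.

```python
import os
import zipfile
from typing import List


def last_n_images(directory: str, n: int = 500) -> List[str]:
    """Return the last `n` JPEG file names in `directory`, sorted by name."""
    if n <= 0:
        return []
    names = sorted(f for f in os.listdir(directory) if f.lower().endswith(".jpeg"))
    return names[-n:]


def zip_images(directory: str, names: List[str], archive: str) -> int:
    """Write the chosen files into `archive` and return how many were added."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # store files under a val_dataset/ prefix (an assumed layout)
            zf.write(os.path.join(directory, name),
                     arcname=os.path.join("val_dataset", name))
    return len(names)


# Hypothetical usage, run from the directory that contains val_dataset/:
# zip_images("val_dataset", last_n_images("val_dataset"), "my_val_subset.zip")
```

The usage line writes to a different archive name on purpose, so it cannot overwrite the file produced by the tutorial script.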

Now copy val_dataset.zip to the target/imagenet directory.

cp -i val_dataset.zip ../../target/imagenet

Once this step is done, everything is ready to run ResNet50 and ResNet18.

2.2 ResNet 50

2.2.1 Get ResNet50 from Vitis AI Model Zoo

Find tf2_resnet50_3.5.zip under Vitis-AI/model_zoo/model-list/tf2_resnet50_3.5, place it in the ~/tutorials/modelzoo folder, and unzip it.

2.2.2 Evaluate Original and Quantized Models

The eval_resnet50.py script carries out the following steps.

Load the 32-bit floating-point model from tensorflow.keras.applications. This model is also already available in the Vitis AI Model Zoo.

Use this model to evaluate the top-1 and top-5 average prediction accuracy on the 500 images in the val_dataset.zip archive.

Quantize the 32-bit model to int8 to make it lighter, then save it as an .h5 file.
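To make the top-1 and top-5 metrics concrete, here is a small pure-Python sketch of top-k accuracy. It is not the code inside eval_resnet50.py (which relies on Keras metrics, as the log shows); the function name and the toy scores are illustrative.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # class indices ordered from highest to lowest score
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 3 samples, 4 classes
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: top-1 hit
    [0.5, 0.1, 0.3, 0.1],  # true class 2: only a top-2 hit
    [0.4, 0.3, 0.2, 0.1],  # true class 3: ranked last
]
labels = [1, 2, 3]
print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

With 1,000 ImageNet classes, top-5 counts a prediction as correct when the true class is among the five highest-scoring classes, which is why it is always at least as high as top-1.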

cd /workspace/tutorials/RESNET18/files
cd modelzoo/ImageNet
unzip val_dataset.zip
cd ../..
bash ./run_all.sh quantize_resnet50_imagenet

The output of the run is shown below.

[DB INFO] Evaluate Average Prediction Accuracy of ResNet50 CNN...

2024-02-04 22:33:56.661033: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_0' with dtype int32
         [[{{node Placeholder/_0}}]]
2024-02-04 22:33:58.483311: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 160563200 exceeds 10% of free system memory.
2024-02-04 22:33:58.606530: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 160563200 exceeds 10% of free system memory.
2024-02-04 22:33:58.711572: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 166348800 exceeds 10% of free system memory.
2024-02-04 22:33:58.801725: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 160563200 exceeds 10% of free system memory.
2024-02-04 22:33:58.899362: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 160563200 exceeds 10% of free system memory.
10/10 [==============================] - 48s 5s/step - loss: 1.0644 - sparse_categorical_accuracy: 0.7540 - sparse_top_k_categorical_accuracy: 0.9240
Original  ResNet50 top1, top5:  0.7540000081062317 0.9240000247955322

[DB INFO] Evaluation of ResNet50 Quantized Model...

WARNING:tensorflow:No training configuration found in the save file, so the model was *not* compiled. Compile it manually.
16/16 [==============================] - 53s 3s/step - loss: 1.0823 - sparse_categorical_accuracy: 0.7500 - sparse_top_k_categorical_accuracy: 0.9240
Quantized ResNet50 top1, top5:  0.75 0.9240000247955322

==================================================================================================
Total params: 25,636,712
Trainable params: 25,583,592
Non-trainable params: 53,120
__________________________________________________________________________________________________
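The small gap between the float and quantized accuracies comes from rounding weights and activations to 8 bits. The sketch below shows generic symmetric int8 quantization with a power-of-two scale. It is only an illustration of the idea under that assumption, not the Vitis AI quantizer, and all names in it are hypothetical.

```python
import math
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric 8-bit quantization with a power-of-two scale."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0 for _ in values], 1.0
    # largest power-of-two scale that keeps max_abs within the int8 range
    frac_bits = math.floor(math.log2(127.0 / max_abs))
    scale = 2.0 ** frac_bits
    # round to the nearest integer and clamp to [-128, 127]
    q = [max(-128, min(127, round(v * scale))) for v in values]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map the integers back to approximate float values."""
    return [x / scale for x in q]


weights = [0.5, -1.0, 0.3]
q, scale = quantize_int8(weights)
print(q, scale)
print(dequantize(q, scale))
```

Values that are exact multiples of 1/scale survive the round trip unchanged, while others (0.3 here) pick up a small rounding error; accumulated over millions of parameters, such errors can shift the accuracy slightly.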

2.3 ResNet 18

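As before, the steps are carried out by a Python script, here eval_resnet18.py.

Load the 32-bit model from colab-classifier.

Evaluate the top-1 and top-5 accuracy on the 500 images.

Quantize the model to 8 bits, save it as an .h5 file, and load it.

Evaluate the accuracy of the quantized model in the same way.

To run the steps above, enter the following commands.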

cd /workspace/tutorials/RESNET18/files
cd modelzoo/ImageNet
unzip val_dataset.zip
cd ../..
bash ./run_all.sh quantize_resnet18_imagenet

The parameter counts are shown below.

==================================================================================================
Total params: 11,699,889
Trainable params: 11,691,947
Non-trainable params: 7,942
__________________________________________________________________________________________________

Below is the result for the original 32-bit floating-point model.

Original  ResNet18 top1, top5:  0.6899999976158142 0.9079999923706055

Below is the result for the model quantized to 8-bit integers.

Quantized ResNet18 top1, top5:  0.6460000276565552 0.8730000257492065

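Note that the accuracy of ResNet18 differs from that of ResNet50. The two networks differ in depth, and they also expect different preprocessing, which has to be matched for each model.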

Differences
1. Mean values : 0, 0, 0 (ResNet18); 123.68, 116.78, 103.94 (ResNet50)
2. Image channel order : RGB (ResNet18); BGR (ResNet50)
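The differences listed above can be sketched for a single pixel as follows. This assumes the mean values are given in R, G, B order and that no further scaling is applied; the table and function names are hypothetical and are not taken from the tutorial code.

```python
from typing import Dict, Tuple

Pixel = Tuple[float, float, float]

# Values from the list above; means assumed to be in R, G, B order
PREPROCESS: Dict[str, dict] = {
    "resnet18": {"mean_rgb": (0.0, 0.0, 0.0), "order": "RGB"},
    "resnet50": {"mean_rgb": (123.68, 116.78, 103.94), "order": "BGR"},
}


def preprocess_pixel(rgb: Pixel, model: str) -> Pixel:
    """Subtract the per-channel mean, then reorder channels if needed."""
    cfg = PREPROCESS[model]
    centered = tuple(c - m for c, m in zip(rgb, cfg["mean_rgb"]))
    if cfg["order"] == "BGR":
        centered = centered[::-1]  # R, G, B -> B, G, R
    return centered


pixel = (200.0, 150.0, 100.0)  # an arbitrary RGB pixel
print(preprocess_pixel(pixel, "resnet18"))
print(preprocess_pixel(pixel, "resnet50"))
```

Feeding a model with the other model's preprocessing would present it with inputs it was never trained on, so each model needs its own settings on both the host and the target board.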

2.4 Run on the Target Board

Following the same procedure as before, ResNet18 and ResNet50 can be run on the target board to classify images.

The C++ image classification program main_resnet50.cc can run both ResNet18 and ResNet50.

2.4.1 Run Time Execution

Run it the same way as in the previous post.

Example run
