NVIDIA 05: Doggy door

김현정 · June 9, 2022


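This post reviews 05 Doggy door from NVIDIA's Fundamentals of Deep Learning course.
Up to the previous session we built the models ourselves, but many good pretrained models are already available online.
Let's load one of those pretrained models and use it for inference.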

Objectives

Use Keras to load a well-trained pretrained model
Preprocess an image so that it can be fed to the pretrained model
Run inference with the pretrained model

Doggy Door

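We want a door that recognizes dogs in an image and lets only them through.
For this we'll use a model pretrained on ImageNet.
ImageNet is a dataset of more than a million images, many of them animals, and models trained on it can classify images into 1000 categories.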

Load the model

Models pretrained on ImageNet can be loaded through the Keras library.
Among those models, we'll use VGG16.

from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet")

model summary

We can see that the structure resembles the CNN models we studied earlier.
Recall from the previous sessions that the input image's dimensions had to match the model's input layer for both training and inference.

model.summary()

(Image: VGG16 model summary output)

Input dim.

The summary shows that the input has the shape (num_of_images, 224, 224, 3). In other words, each image is 224x224 pixels with 3 RGB color channels.
When we run inference, the image we pass to the model has to be reshaped to these same dimensions.
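
As a small illustration of the batch dimension, here is a sketch that turns a single (224, 224, 3) array into a batch of one. The zero-filled array is only a stand-in for a real image, and the variable names are made up for this example.

```python
import numpy as np

# Stand-in for one decoded RGB image: (height, width, channels).
single_image = np.zeros((224, 224, 3), dtype=np.float32)

# Option 1: reshape to an explicit batch of one image.
batch_a = single_image.reshape(1, 224, 224, 3)

# Option 2: add the batch axis without hard-coding the image size.
batch_b = np.expand_dims(single_image, axis=0)

print(batch_a.shape, batch_b.shape)
```

Both options give the same shape. `np.expand_dims` has the small advantage that it still works if the image size changes.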

Output dim.

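The output layer has 1000 units, one per category.
ImageNet itself covers more than 20,000 categories, but the model we're using classifies over a 1000-category subset.
The mapping between this model's categories and their indices is available at the following link:
VGG16_categories
According to that list, dogs correspond to indices 151 to 268 and cats to indices 281 to 285.
At inference time, we only need to check which of these ranges the predicted index falls into.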
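
The range check can be sketched on its own, without the model. The helper name `category_group` and the two constants are hypothetical and only serve this example; the index ranges are the ones quoted above.

```python
# Index ranges taken from the category list above (upper bounds are inclusive there,
# so the range() stop values are one higher).
DOG_INDICES = range(151, 269)   # 151..268
CAT_INDICES = range(281, 286)   # 281..285


def category_group(class_index):
    """Map a predicted class index to 'dog', 'cat', or 'other'."""
    if class_index in DOG_INDICES:
        return "dog"
    if class_index in CAT_INDICES:
        return "cat"
    return "other"


for index in (151, 268, 269, 281, 285, 999):
    print(index, category_group(index))
```

Keeping the ranges in named constants makes the boundary values easy to check against the category list.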

Load the image

Let's take a look at 'happy_dog.jpg' in data/doggy_door_images.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

def show_image(image_path):
    # Read the image file into a NumPy array and display it
    image = mpimg.imread(image_path)
    print(image.shape)
    plt.imshow(image)

show_image("data/doggy_door_images/happy_dog.jpg")

The output shows that the image has the shape (1200, 1800, 3).

Preprocessing the image

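Now we need to reshape the image we pass to the model to (1, 224, 224, 3).
One advantage of loading the model through Keras is that we can use the accompanying preprocess_input function.
It preprocesses an image in the same way as the images in the dataset the model was trained on.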

from tensorflow.keras.preprocessing import image as image_utils
from tensorflow.keras.applications.vgg16 import preprocess_input

def load_and_process_image(image_path):
    # Print the original image's shape for comparison
    print('Original image shape: ', mpimg.imread(image_path).shape)
    # Load the image, resized to the model's expected input size
    image = image_utils.load_img(image_path, target_size=(224, 224))
    # Convert from PIL format to a NumPy array
    image = image_utils.img_to_array(image)
    # Add the batch dimension
    image = image.reshape(1, 224, 224, 3)

    # Preprocess the image to match the original ImageNet dataset
    image = preprocess_input(image)

    print('Processed image shape: ', image.shape)

    return image

load_and_process_image("data/doggy_door_images/brown_bear.jpg")
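
To make the preprocessing step less of a black box, here is a rough NumPy sketch of what, as far as I understand the Keras documentation, the VGG16 preprocess_input does: it converts RGB to BGR and subtracts the per-channel ImageNet mean, without scaling. The function name `caffe_style_preprocess` is hypothetical and this is not the library's code. In practice, keep using preprocess_input.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed to match the values Keras uses).
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(rgb_batch):
    """Illustrative re-implementation: RGB -> BGR, then zero-center each channel."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    bgr = x[..., ::-1]                 # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS    # broadcast the means over every pixel


# A synthetic batch with one pure-red image; a real image would come from load_img.
batch = np.zeros((1, 224, 224, 3), dtype=np.float32)
batch[..., 0] = 255.0                  # red channel in RGB order

processed = caffe_style_preprocess(batch)
print(processed.shape)
print(processed[0, 0, 0])              # the red value now sits in the last (R) slot of BGR
```

The shape is unchanged by this step. Only the channel order and the value range differ from the raw image.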

(Image: original and processed image shapes printed by load_and_process_image)

Prediction

The prediction output is an array of 1000 values, one per category.
As with preprocessing, a model loaded through Keras comes with a useful companion function, decode_predictions.
It translates the prediction array into a human-readable form.

from tensorflow.keras.applications.vgg16 import decode_predictions

def readable_prediction(image_path):
    # Show the image, then print the three most likely categories
    show_image(image_path)
    image = load_and_process_image(image_path)
    predictions = model.predict(image)
    print('Predicted:', decode_predictions(predictions, top=3))
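
The idea behind decoding can be sketched with plain NumPy: sort the scores, keep the top k, and look up their labels. The four labels and scores below are made up for illustration, and `top_k_labels` is a hypothetical helper. The real decode_predictions works on the 1000 ImageNet classes and returns (class_id, class_name, score) tuples for each image in the batch.

```python
import numpy as np


def top_k_labels(probabilities, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    probs = np.asarray(probabilities)
    top_indices = np.argsort(probs)[::-1][:k]   # indices sorted by descending score
    return [(labels[i], float(probs[i])) for i in top_indices]


# Hypothetical toy example with four classes instead of 1000.
labels = ["cat", "dog", "bear", "fox"]
probs = [0.05, 0.70, 0.20, 0.05]

print(top_k_labels(probs, labels, k=2))
```

With these toy scores, the two best entries are "dog" and "bear", in that order.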

Let's make a prediction

readable_prediction("data/doggy_door_images/happy_dog.jpg")

(Image: prediction output for happy_dog.jpg)

Only Dogs

Using this model, let's write a function that lets dogs go in and out,
keeps cats inside,
and keeps any other animal outside.

We can apply np.argmax() to the prediction and check whether the resulting index falls in the dog range or the cat range.

import numpy as np

def doggy_door(image_path):
    show_image(image_path)
    image = load_and_process_image(image_path)
    preds = model.predict(image)
    if 151 <= np.argmax(preds) <= 268:
        print("Doggy come on in!")
    elif 281 <= np.argmax(preds) <= 285:
        print("Kitty stay inside!")
    else:
        print("You're not a dog! Stay outside!")
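
One detail worth noting about the function above: `np.argmax(preds)` without an axis works on the flattened array, which is fine here because the batch holds a single image. The sketch below uses synthetic prediction arrays (not real model output) to show the difference once a batch contains more than one image.

```python
import numpy as np

# Synthetic prediction for one image: highest score at an index inside the dog range.
preds = np.zeros((1, 1000), dtype=np.float32)
preds[0, 207] = 0.9

flat_index = int(np.argmax(preds))        # flattened index, equals the class index here
per_image = np.argmax(preds, axis=1)      # one class index per image in the batch

# Synthetic batch of two images: the second one has the overall highest score.
batch = np.zeros((2, 1000), dtype=np.float32)
batch[0, 207] = 0.6
batch[1, 282] = 0.9

print(flat_index, per_image)
print(int(np.argmax(batch)))              # flattened index, no longer a valid class index
print(np.argmax(batch, axis=1))           # class index for each image
```

For batches larger than one, `axis=1` is the safer choice, since the flattened index can exceed 999.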

Clear GPU memory

import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

A pretrained model lets us run effective inference, but it will not fit the data we actually want to use perfectly.
In the next session we'll look at transfer learning
and see how to adapt a pretrained model to our own data.
