NVIDIA 04-2: ASL Prediction

김현정 · June 9, 2022

This post reviews 04-2 ASL Prediction from NVIDIA's Fundamentals of Deep Learning course.
Let's load the well-trained model we saved earlier and run predictions on new images.

Objectives

Load a saved, trained model
Reformat images in a different format to match what the trained model expects
Run inference on new images

Loading the Model

In the previous post, 04 Data Augmentation, we saved the trained model to a folder named 'asl_model'.
Let's load that model again and put it to use.

from tensorflow import keras
model = keras.models.load_model('asl_model')

To check the model summary:

model.summary()

Preparing the images

Let's make predictions on new images.

This process is called inference.
The new images are in the data/asl_images folder.

Looking in the folder, we can see that the new images are in color and have a higher resolution than the 28*28 grayscale images we used before.

When making predictions with a model, the input shape must always match the shape of the data the model was trained on!
During training we used images with shape (27455, 28, 28, 1).
That means 27455 grayscale images (1 color channel), each 28*28 pixels.
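
To make the shape rule concrete, here is a minimal sketch of a check we could run before calling the model. The helper name `matches_training_shape` and the size of the color image are made up for illustration; only the (28, 28, 1) per-image shape comes from the training data described above.

```python
import numpy as np

# Per-image shape the model was trained on: 28x28 pixels, 1 color channel.
EXPECTED_IMAGE_SHAPE = (28, 28, 1)

def matches_training_shape(batch, expected=EXPECTED_IMAGE_SHAPE):
    """Return True if every image in `batch` has the per-image shape `expected`.

    The first axis is the number of images and may be any size.
    """
    return batch.ndim == len(expected) + 1 and batch.shape[1:] == expected

training_like = np.zeros((27455, 28, 28, 1), dtype=np.uint8)  # same layout as the training data
single_image = np.zeros((1, 28, 28, 1), dtype=np.uint8)       # one correctly shaped image
color_image = np.zeros((1, 184, 186, 3), dtype=np.uint8)      # hypothetical high-res color image

print(matches_training_shape(training_like))  # True
print(matches_training_shape(single_image))   # True
print(matches_training_shape(color_image))    # False
```

The number of images is free to change, which is why one image at inference time is fine as long as it is shaped (1, 28, 28, 1).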

Showing the Images

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

def show_image(image_path):
    # Read the image file into an array and display it
    image = mpimg.imread(image_path)
    plt.imshow(image, cmap='gray')

show_image('data/asl_images/b.png')

Image Scaling

We need to feed the model a 28*28 grayscale image.
We could edit the image by hand in Python, but let's use the helper built into Keras instead.

from tensorflow.keras.preprocessing import image as image_utils

def load_and_scale_image(image_path):
    # Load the file as grayscale and resize it to 28*28 in one step
    image = image_utils.load_img(image_path, color_mode="grayscale", target_size=(28, 28))
    return image

image = load_and_scale_image('data/asl_images/b.png')
plt.imshow(image, cmap='gray')
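
To get a feel for what "grayscale plus resize" means for the pixel values, here is a rough NumPy-only sketch. It is not how `load_img` is implemented; it is a conceptual stand-in that assumes ITU-R 601 luma weights for the grayscale step and plain nearest-pixel sampling for the resize. The function names and the random 184*186 test image are made up for illustration.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array to (H, W) using ITU-R 601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3] @ weights

def resize_nearest(gray, target_size=(28, 28)):
    """Shrink an (H, W) array by picking one source pixel per target pixel."""
    height, width = gray.shape
    rows = (np.arange(target_size[0]) * height) // target_size[0]
    cols = (np.arange(target_size[1]) * width) // target_size[1]
    return gray[np.ix_(rows, cols)]

# Hypothetical color image filled with random pixel values in 0..255
rng = np.random.default_rng(0)
color_image = rng.integers(0, 256, size=(184, 186, 3)).astype(np.float64)

small = resize_nearest(to_grayscale(color_image))
print(small.shape)  # (28, 28)
```

In practice the Keras helper above is the simpler choice, since it handles file decoding, color conversion, and resizing together.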

Preparing the image for Prediction

Converting the image to a NumPy array makes it easier to apply numerical operations, so we convert it with a Keras method.

image = image_utils.img_to_array(image)

The shape we want is (num_images, 28, 28, 1).

image = image.reshape(1, 28, 28, 1)  # 1 image, 28*28, grayscale

Data normalization: scale the values into the 0 to 1 range.

image = image / 255
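
The reshape and normalization steps can be tried out without any image file. The sketch below uses a synthetic gradient as a stand-in for the output of `img_to_array`; the helper name `prepare_for_model` is made up for illustration.

```python
import numpy as np

def prepare_for_model(pixels):
    """Turn a (28, 28, 1) array of 0-255 values into a (1, 28, 28, 1) batch in [0, 1]."""
    batch = pixels.reshape(1, 28, 28, 1)
    return batch.astype("float32") / 255

# Synthetic stand-in for the output of img_to_array: a 28*28 gradient from 0 to 255
pixels = np.linspace(0, 255, 28 * 28, dtype="float32").reshape(28, 28, 1)

batch = prepare_for_model(pixels)
print(batch.shape)  # (1, 28, 28, 1)
```

Dividing by 255 assumes 8-bit pixel values, which is what the grayscale images here contain.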

Making Predictions

prediction = model.predict(image)
print(prediction)

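Interpretation

The prediction holds 24 values for our single image.
Each value is the probability (between 0 and 1) that the image shows the corresponding letter.
So the element with the highest value is the model's predicted answer for the image.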

import numpy as np
np.argmax(prediction)

The result is 1.
This means that the element at index 1 (indices run from 0 to 23) has the highest value.

Our alphabet data excludes j and z (because those signs involve moving gestures).

So let's map the result onto the alphabet without j and z.

alphabet = "abcdefghiklmnopqrstuvwxy"
alphabet[np.argmax(prediction)]

The result is 'b'.
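
Because j is missing, every index from 9 onward is shifted by one letter compared with the full alphabet, which is easy to get wrong by hand. Here is a small sketch that also reports the runner-up letters. The probabilities are made up to look like a `model.predict` result for one image, and `top_letters` is a hypothetical helper name.

```python
import numpy as np

ALPHABET = "abcdefghiklmnopqrstuvwxy"  # 24 letters: no j, no z

def top_letters(prediction, k=3):
    """Return the k most probable (letter, probability) pairs, best first."""
    probabilities = np.asarray(prediction).reshape(-1)
    best = np.argsort(probabilities)[::-1][:k]
    return [(ALPHABET[i], float(probabilities[i])) for i in best]

# Made-up probabilities with the same shape as a prediction for one image: (1, 24)
fake_prediction = np.full((1, 24), 0.01)
fake_prediction[0, 1] = 0.70  # index 1 -> 'b'
fake_prediction[0, 4] = 0.08  # index 4 -> 'e'

print(top_letters(fake_prediction, k=2))  # [('b', 0.7), ('e', 0.08)]
print(ALPHABET[9])                        # k (not j, because j is skipped)
```

Looking at the second-best letter is a quick way to see how confident the model is about its top answer.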

Pull it together

Let's wrap the code we wrote above into a single function.
The function should make a prediction given only an image file path.

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image as image_utils
from tensorflow import keras
import numpy as np

def predict_letter(file_path):
    # Load the saved model
    model = keras.models.load_model('asl_model')

    # Show the original image
    image = mpimg.imread(file_path)
    plt.imshow(image, cmap='gray')

    # Load and scale the image
    image = image_utils.load_img(file_path, color_mode="grayscale", target_size=(28, 28))

    # Convert to array
    image = image_utils.img_to_array(image)

    # Reshape to (num_images, 28, 28, 1)
    image = image.reshape(1, 28, 28, 1)

    # Normalize to the 0 to 1 range
    image = image / 255

    # Make the prediction
    prediction = model.predict(image)

    # Convert the prediction to a letter
    alphabet = "abcdefghiklmnopqrstuvwxy"
    predicted_letter = alphabet[np.argmax(prediction)]

    return predicted_letter

If we reuse the functions defined above as they are:

def predict_letter(file_path):
    # Relies on show_image, load_and_scale_image, model, and alphabet defined earlier
    show_image(file_path)
    image = load_and_scale_image(file_path)
    image = image_utils.img_to_array(image)
    image = image.reshape(1, 28, 28, 1)
    image = image / 255
    prediction = model.predict(image)
    predicted_letter = alphabet[np.argmax(prediction)]
    return predicted_letter
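
One possible variation, sketched under my own assumptions rather than taken from the course: pass the prediction function in as an argument. That way the preprocessing and letter mapping can be tried without a saved model or image files. `predict_letter_from_pixels` and `stub_predict` are hypothetical names, and the stub simply pretends the model always favors index 1.

```python
import numpy as np

ALPHABET = "abcdefghiklmnopqrstuvwxy"  # 24 letters: no j, no z

def predict_letter_from_pixels(pixels, predict_fn):
    """Classify one 28*28 grayscale image given as an array of 0-255 values.

    `predict_fn` is any callable that maps a (1, 28, 28, 1) batch to a
    (1, 24) array of scores, for example `model.predict`.
    """
    batch = np.asarray(pixels, dtype="float32").reshape(1, 28, 28, 1) / 255
    scores = predict_fn(batch)
    return ALPHABET[int(np.argmax(scores))]

def stub_predict(batch):
    """Hypothetical stand-in for a trained model: always favors index 1."""
    scores = np.zeros((batch.shape[0], 24))
    scores[:, 1] = 1.0
    return scores

print(predict_letter_from_pixels(np.zeros((28, 28)), stub_predict))  # b
```

With the real model, we would load it once and pass `model.predict` as `predict_fn`, which also avoids reloading the model on every call.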

Prediction

predict_letter("data/asl_images/b.png")

predict_letter("data/asl_images/a.png")

GPU Memory clear

import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

Next

Deep learning when we don't have a robust dataset.
We will look at how to work faster with a pre-trained model.
