A Look Inside the Handwriting Similarity Check Code

이양규 · February 14, 2022

✔ The inner product of two vectors can also be used to measure how similar the two vectors are.

Full code

from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

digits = load_digits()
d1 = digits.images[0]
d2 = digits.images[10]
d3 = digits.images[1]
d4 = digits.images[11]
v1 = d1.reshape(64, 1)
v2 = d2.reshape(64, 1)
v3 = d3.reshape(64, 1)
v4 = d4.reshape(64, 1)

plt.figure(figsize=(9, 9))
gs = gridspec.GridSpec(1, 8, height_ratios=[1],
                       width_ratios=[9, 1, 9, 1, 9, 1, 9, 1])
for i in range(4):
    plt.subplot(gs[2 * i])
    plt.imshow(eval("d" + str(i + 1)), aspect=1,
               interpolation='nearest', cmap=plt.cm.bone_r)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.title("image {}".format(i + 1))
    plt.subplot(gs[2 * i + 1])
    plt.imshow(eval("v" + str(i + 1)), aspect=0.25,
               interpolation='nearest', cmap=plt.cm.bone_r)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.title("vector {}".format(i + 1))
plt.tight_layout()
plt.show()

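💻 Python learned from the code

✔ Importing packages

◻ from sklearn.datasets import load_digits
sklearn.datasets is a module that provides small built-in datasets.
load_digits loads the handwritten digits dataset from among them.

◻ import matplotlib.gridspec as gridspec
Used when you want to divide a Figure into regions with custom sizes.

✔ Loading the handwritten digits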

digits = load_digits()
d1 = digits.images[0]
d2 = digits.images[10]
d3 = digits.images[1]
d4 = digits.images[11]

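load_digits().images[index] returns one handwritten digit image.
For the indices used here, the remainder of the index divided by 10 is the digit that is returned. This pattern is not guaranteed for every index in the dataset; digits.target[index] gives the actual label.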

d1 = digits.images[0]
d2 = digits.images[10]

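Running this code returns a 0 in both cases.
But they are not the same 0; the two images differ slightly.

Fetching images[0] through images[9] gives one sample of each digit from 0 to 9.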
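
To confirm which digit a given index holds, the labels stored in digits.target can be inspected directly. The following is a minimal sketch, assuming scikit-learn and NumPy are installed; the names `indices`, `labels`, and `same_pixels` are illustrative and do not appear in the original code.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# digits.target holds the true label of each image in digits.images.
indices = [0, 10, 1, 11]
labels = digits.target[indices].tolist()
print(labels)  # labels of the four images used in this post

# Two samples of the same digit are still different pixel arrays.
same_pixels = np.array_equal(digits.images[0], digits.images[10])
print(same_pixels)

# Each image is an 8x8 array of pixel intensities.
print(digits.images[0].shape)
```

Looking up digits.target is more reliable than relying on the index pattern, since the label is read from the dataset itself.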

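✔ Converting the 2D images to vectors

To compute the inner product, each 8x8 image is reshaped into a 64x1 column vector.

⚠ Conditions for the inner product of two vectors
1. The two vectors must have the same dimension (length).
2. Written as a matrix product, the first vector must be a row vector and the second a column vector.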
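
The two conditions can be checked on a toy example. This is a small NumPy sketch with made-up three-element vectors `x` and `y` (not taken from the digits data); it only illustrates the shape rule, under the assumption that vectors are stored as column arrays as in this post.

```python
import numpy as np

# Two column vectors of the same length (shape (3, 1)).
x = np.array([[1.0], [2.0], [3.0]])
y = np.array([[4.0], [5.0], [6.0]])

# Row vector (1, 3) times column vector (3, 1) gives a 1x1 matrix.
inner = x.T @ y
print(inner.shape)
inner_value = inner[0, 0]
print(inner_value)  # 1*4 + 2*5 + 3*6 = 32.0

# Multiplying two column vectors directly violates condition 2.
try:
    x @ y
    shapes_ok = True
except ValueError:
    shapes_ok = False
print(shapes_ok)
```

Transposing the first vector with `.T` is what turns the column vector into the row vector that the matrix product requires.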

v1 = d1.reshape(64, 1)
v2 = d2.reshape(64, 1)
v3 = d3.reshape(64, 1)
v4 = d4.reshape(64, 1)
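
The full code only displays the images and their vectors; the inner products themselves could be computed along the following lines. This is a sketch rather than the original author's code: the helper `inner_product` is hypothetical, and it assumes that a larger inner product means two images are more similar.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# Reshape each 8x8 image into a 64x1 column vector, as in the post.
v1 = digits.images[0].reshape(64, 1)   # a "0"
v2 = digits.images[10].reshape(64, 1)  # another "0"
v3 = digits.images[1].reshape(64, 1)   # a "1"
v4 = digits.images[11].reshape(64, 1)  # another "1"


def inner_product(a, b):
    """Return the scalar a^T b for two column vectors (hypothetical helper)."""
    return float((a.T @ b)[0, 0])


# Pairs showing the same digit.
print("0 vs 0:", inner_product(v1, v2))
print("1 vs 1:", inner_product(v3, v4))

# Pairs showing different digits.
print("0 vs 1:", inner_product(v1, v3))
print("0 vs 1:", inner_product(v2, v4))
```

If the inner product works as a similarity measure here, the same-digit pairs should come out larger than the mixed pairs.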

✔ Setting up the layout for displaying the digits

plt.figure(figsize=(9, 9))
gs = gridspec.GridSpec(1, 8, height_ratios=[1],
                       width_ratios=[9, 1, 9, 1, 9, 1, 9, 1])

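◻ plt.figure(figsize=(9, 9))
The whole figure is 9 inches wide and 9 inches tall.

◻ gs = gridspec.GridSpec(1, 8, height_ratios=[1],
width_ratios=[9, 1, 9, 1, 9, 1, 9, 1])
A grid layout for placing subplots within the figure.

⚠ The grid layout has 1 row and 8 columns.
⚠ height_ratios: defines the relative heights of the rows.
⚠ width_ratios: defines the relative widths of the columns. Here each image column is 9 times as wide as the vector column next to it.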
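
The effect of width_ratios can be verified by measuring the axes that the grid produces. This is a minimal sketch, assuming matplotlib is installed; the Agg backend and the names `ax_wide`, `ax_narrow`, `wide`, and `narrow` are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(9, 9))
gs = gridspec.GridSpec(1, 8, height_ratios=[1],
                       width_ratios=[9, 1, 9, 1, 9, 1, 9, 1])

# The grid has 1 row and 8 columns.
print(gs.get_geometry())

# Create one wide (image) cell and one narrow (vector) cell.
ax_wide = fig.add_subplot(gs[0])
ax_narrow = fig.add_subplot(gs[1])

# Compare their widths in figure coordinates.
wide = ax_wide.get_position().width
narrow = ax_narrow.get_position().width
print(wide / narrow)  # ratio between the two column widths
```

The ratio printed at the end reflects the 9:1 setting, which is why the images take up most of the figure while each vector is drawn as a thin strip.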

✔ Placing the digits on the figure

for i in range(4):
    plt.subplot(gs[2 * i])
    plt.imshow(eval("d" + str(i + 1)), aspect=1,
               interpolation='nearest', cmap=plt.cm.bone_r)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.title("image {}".format(i + 1))
    plt.subplot(gs[2 * i + 1])
    plt.imshow(eval("v" + str(i + 1)), aspect=0.25,
               interpolation='nearest', cmap=plt.cm.bone_r)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.title("vector {}".format(i + 1))

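◻ plt.subplot(gs[2 * i])
Draws several plots in one figure.

Indexing the grid with gs[2 * i] selects a single cell of the layout.
Passing that cell to plt.subplot places the subplot at the corresponding position: cells 0, 2, 4, 6 hold the images and cells 1, 3, 5, 7 hold the vectors.

◻ eval("d" + str(i + 1)) / eval("v" + str(i + 1))
"d" + str(i + 1) and "v" + str(i + 1) are strings.
To make them behave as code, they must be converted into code.
eval() does this: it evaluates the string as a Python expression, so eval("d1") returns the array stored in d1.

Without eval, plt.imshow receives the string itself (for example "d1") rather than the array, so the image cannot be drawn.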
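
The same lookup can also be done without eval by keeping the arrays in a list and indexing it. The sketch below uses small stand-in arrays instead of the real digit images, and the names `images`, `via_eval`, and `via_list` are illustrative; it is one possible alternative, not the approach taken in the original code.

```python
import numpy as np

# Stand-in arrays; in the post these would be d1..d4 from digits.images.
d1 = np.full((8, 8), 0)
d2 = np.full((8, 8), 1)
d3 = np.full((8, 8), 2)
d4 = np.full((8, 8), 3)

# Alternative: keep the arrays in a list and index it directly.
images = [d1, d2, d3, d4]

via_eval = []
via_list = []
for i in range(4):
    # With eval: build the variable name as a string, then evaluate it.
    via_eval.append(eval("d" + str(i + 1)))
    # Without eval: plain list indexing.
    via_list.append(images[i])

# Both approaches return the very same array objects.
same = all(a is b for a, b in zip(via_eval, via_list))
print(same)

# Without eval, the built name is only a string, not the array.
print(type("d" + str(1)))
```

Indexing a list avoids evaluating strings as code, which also makes the loop easier to extend to more images.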