Perceptron

์ฐฝ์Šˆ · April 9, 2025


๐Ÿง  What Is a Neural Network?

  • ๋”ฅ๋Ÿฌ๋‹์€ 1950๋…„๋Œ€๋ถ€ํ„ฐ ์—ฐ๊ตฌ๋œ ์ธ๊ณต ์‹ ๊ฒฝ๋ง(ANN)์—์„œ ์‹œ์ž‘๋˜์—ˆ๋‹ค.
  • ์ธ๊ณต ์‹ ๊ฒฝ๋ง์€ ์ธ๊ฐ„์˜ ๋‡Œ ๊ตฌ์กฐ์—์„œ ์˜๊ฐ์„ ๋ฐ›์•„ ๋งŒ๋“ค์–ด์ง„ ์ปดํ“จํŒ… ๊ตฌ์กฐ์ด๋‹ค.

Differences from Traditional Computers

์‹ ๊ฒฝ๋ง์€ ์œ ๋‹› ๋˜๋Š” ๋…ธ๋“œ๋“ค์ด ์—ฐ๊ฒฐ๋˜์–ด ๋™์ž‘ํ•˜๋ฉฐ, ์ž…๋ ฅ์˜ ์ดํ•ฉ์„ ํ•จ์ˆ˜๋กœ ์ฒ˜๋ฆฌํ•˜์—ฌ ์ถœ๋ ฅ์„ ์ „๋‹ฌํ•œ๋‹ค.

์‹ ๊ฒฝ๋ง์˜ ์žฅ์ 

Learnability: given only data, the network can learn from examples on its own.
Fault tolerance: even if some units fail, overall performance is not greatly affected.


The Perceptron

  • ํผ์…‰ํŠธ๋ก ์€ 1957๋…„์— '๋กœ์  ๋ธ”๋ผํŠธ'๊ฐ€ ๊ณ ์•ˆํ•œ ์ธ๊ณต ์‹ ๊ฒฝ๋ง์ด๋‹ค. ์ด๋Š”, ๊ฐ€์žฅ ๋‹จ์ˆœํ•œ ์ธ๊ณต์‹ ๊ฒฝ๋ง์œผ๋กœ ๋ถˆ๋ฆฐ๋‹ค.
  • ํผ์…‰ํŠธ๋ก ์€ ํ•˜๋‚˜์˜ ์œ ๋‹›๋งŒ ์‚ฌ์šฉํ•˜๋Š” ๋ชจ๋ธ๋กœ ์–ด๋Ÿฌ๊ฐœ์˜ ์ž…๋ ฅ์„ ๋ฐ›์•„์„œ ํ•˜๋‚˜์˜ ์‹ ํ˜ธ๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ์žฅ์น˜์ด๋‹ค.
  • ๋‰ด๋Ÿฐ์—์„œ๋Š” ์ž…๋ ฅ ์‹ ํ˜ธ์˜ ๊ฐ€์ค‘์น˜ ํ•ฉ์ด ์–ด๋–ค ์ž„๊ณ„๊ฐ’์„ ๋„˜์–ด๊ฐ€๋Š” ๊ฒฝ์šฐ์—๋งŒ ํ™œ์„ฑํ™”๋˜๋ฉฐ 1์„ ์ถœ๋ ฅํ•˜๊ณ , ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด 0์„ ์ถœ๋ ฅํ•œ๋‹ค.

y = \begin{cases} 1 & \text{if } w_1 x_1 + w_2 x_2 + b \ge 0 \\ 0 & \text{otherwise} \end{cases}

์ž…๋ ฅ์ด 2์ด๊ณ  ์ถœ๋ ฅ์ด 1๊ฐœ์ธ ํผ์…‰ํŠธ๋ก  || ๊ฐ€์ค‘์น˜(weight) w1, w2 || ๋ฐ”์ด์–ด์Šค(bias) b

๐Ÿงฎ ํผ์…‰ํŠธ๋ก ์˜ ๋…ผ๋ฆฌ ์—ฐ์‚ฐ

โœ”๏ธ AND ์—ฐ์‚ฐ

x1  x2  y
0   0   0
1   0   0
0   1   0
1   1   1

โœ”๏ธ OR ์—ฐ์‚ฐ

x1  x2  y
0   0   0
1   0   1
0   1   1
1   1   1
  • z=w1x1+w2x2+bz = w_1 x_1 + w_2 x_2 + b
  • ํ™œ์„ฑํ™” ํ•จ์ˆ˜: f(z)={1ifย zโ‰ฅ00ifย z<0f(z) = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{if } z < 0 \end{cases}

ํ•ด๋‹น ํผ์…‰ํŠธ๋ก ์€ AND ์—ฐ์‚ฐ์— ๋Œ€ํ•œ ๋ฌธ์ œ๊ฐ€ ์—†์Œ


Perceptron Implementation Code

# ์ˆœ์ˆ˜ ํŒŒ์ด์ฌ์œผ๋กœ ๊ตฌํ˜„ํ•œ ํผ์…‰ํŠธ๋ก 
epsilon = 1e-7

def perceptron(x1, x2):
    w1, w2, b = 1.0, 1.0, -1.5
    sum = x1 * w1 + x2 * w2 + b
    if sum > epsilon:
        return 1
    else:
        return 0

print(perceptron(0, 0))  # ์ถœ๋ ฅ: 0
print(perceptron(1, 0))  # ์ถœ๋ ฅ: 0
print(perceptron(0, 1))  # ์ถœ๋ ฅ: 0
print(perceptron(1, 1))  # ์ถœ๋ ฅ: 1
์ถœ๋ ฅ:
0
0
0
1
# Perceptron implemented with NumPy
import numpy as np

epsilon = 1e-7

def perceptron(x1, x2):
    X = np.array([x1, x2])
    W = np.array([1.0, 1.0])
    B = -1.5
    weighted_sum = np.dot(W, X) + B  # w . x + b as a dot product
    if weighted_sum > epsilon:
        return 1
    else:
        return 0

print(perceptron(0, 0))  # Output: 0
print(perceptron(1, 0))  # Output: 0
print(perceptron(0, 1))  # Output: 0
print(perceptron(1, 1))  # Output: 1

Output:
0
0
0
1
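
The same unit realizes OR by changing only the bias. A minimal sketch; the choice w1 = w2 = 1.0, b = -0.5 is one of many weight settings that work:

# Perceptron for OR: same structure, different bias (illustrative weights)
epsilon = 1e-7

def perceptron_or(x1, x2):
    w1, w2, b = 1.0, 1.0, -0.5  # any single input of 1 pushes the sum past 0
    weighted_sum = x1 * w1 + x2 * w2 + b
    return 1 if weighted_sum > epsilon else 0

print(perceptron_or(0, 0))  # Output: 0
print(perceptron_or(1, 0))  # Output: 1
print(perceptron_or(0, 1))  # Output: 1
print(perceptron_or(1, 1))  # Output: 1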

Perceptron Learning Algorithm

  • ํผ์…‰ํŠธ๋ก ๋„ ํ•™์Šต์„ ํ•œ๋‹ค
    ์‹ ๊ฒฝ๋ง์ด ํ•™์Šตํ•œ๋‹ค๊ณ  ๋งํ•˜๋ ค๋ฉด, ๊ฐ€์ค‘์น˜๋ฅผ ์‚ฌ๋žŒ์ด ์ผ์ผ์ด ์„ค์ •ํ•˜์ง€ ์•Š์•„๋„ ์Šค์Šค๋กœ ์กฐ์ •ํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ํ•„์š”ํ•˜๋‹ค.
    ํผ์…‰ํŠธ๋ก ์—๋Š” ์ด๋Ÿฌํ•œ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ์กด์žฌํ•˜๋ฉฐ, ์ด๋ฅผ ํ†ตํ•ด ์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ๊ฐ€์ค‘์น˜๋ฅผ ์กฐ์ •ํ•  ์ˆ˜ ์žˆ๋‹ค.

  • ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์˜ ๊ตฌ์„ฑ
    ํผ์…‰ํŠธ๋ก ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ˜•ํƒœ์˜ ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค:

    (๐‘ฅโ‚,dโ‚),(๐‘ฅโ‚‚,dโ‚‚),...,(๐‘ฅโ‚˜,dโ‚˜)(๐‘ฅโ‚, dโ‚), (๐‘ฅโ‚‚, dโ‚‚), ..., (๐‘ฅโ‚˜, dโ‚˜)

    ์—ฌ๊ธฐ์„œ,
    xkx_k: ์ž…๋ ฅ ๋ฒกํ„ฐ (feature vector)
    dkd_k: ํ•ด๋‹น ์ž…๋ ฅ์— ๋Œ€ํ•œ ์ •๋‹ต (target label)
    mm: ์ „์ฒด ํ›ˆ๋ จ ์ƒ˜ํ”Œ ์ˆ˜

  • ํผ์…‰ํŠธ๋ก  ์ถœ๋ ฅ ์‹

    yk(t)=f(w(t)โ‹…xk+b)y_k^{(t)} = f(w^{(t)} \cdot x_k + b)

    ์—์„œ ๋ฐ”์ด์–ด์Šค(bb)๋ฅผ ๊ฐ€์ค‘์น˜ w0w_0(input์„ ํ•˜๋‚˜ ๋„ฃ๋Š” ์‹์œผ๋กœ)๋กœ ๊ฐ„์ฃผ

    yk(t)=f(w(t)โ‹…xk)y_k^{(t)} = f(w^{(t)} \cdot x_k)

    ์—ฌ๊ธฐ์„œ,
    x(t)x^{(t)}: ํ˜„์žฌ ์‹œ์  tt์—์„œ์˜ ๊ฐ€์ค‘์น˜ ๋ฒกํ„ฐ
    xkx_k: ์ž…๋ ฅ ๋ฒกํ„ฐ
    f(โ‹…)f(โ‹…): ํ™œ์„ฑํ™” ํ•จ์ˆ˜ (์˜ˆ: ๊ณ„๋‹จ ํ•จ์ˆ˜)
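
A quick check of this bias-absorption trick; the concrete numbers are illustrative only:

import numpy as np

w1, w2, b = 1.0, 1.0, -1.5
x = np.array([1.0, 1.0])

# Append a constant 1 to the input and treat b as one more weight,
# so that w . x + b becomes a single dot product w_aug . x_aug.
x_aug = np.append(x, 1.0)      # [x1, x2, 1]
w_aug = np.array([w1, w2, b])  # [w1, w2, b]

assert np.dot(w_aug, x_aug) == np.dot([w1, w2], x) + b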


โœ… Training Procedure

Input training data: (x_1, d_1), (x_2, d_2), \ldots, (x_m, d_m)

  • x_k is an input vector and d_k is the target label for that input.

โœ”๏ธ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜ ๋‹จ๊ณ„

  1. Initialize the weights
    Initialize all weights w_i and the bias b to 0 or to small random values.

  2. Repeat training
    Repeat the following until the weights no longer change:

  3. For each sample, do the following

    • Compute the output:
      y_k^{(t)} = f(w^{(t)} \cdot x_k)

    • Update the weights:
      w_i^{(t+1)} = w_i^{(t)} + \eta \cdot (d_k - y_k^{(t)}) \cdot x_{k,i}

      • x_{k,i} is the i-th element of the k-th input vector
      • d_k is the target and y_k^{(t)} is the current output
  • ํ•™์Šต๋ฅ  ฮทฮท (learning rate)
    0<ฮทโ‰ค10<ฮทโ‰ค1 ๋ฒ”์œ„์˜ ๊ฐ’์ด๋ฉฐ, ๊ฐ€์ค‘์น˜๊ฐ€ ์–ผ๋งˆ๋‚˜ ๋น ๋ฅด๊ฒŒ ๋ณ€ํ™”ํ• ์ง€๋ฅผ ๊ฒฐ์ •ํ•˜๋Š” ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ์ด๋‹ค.

  • ์†์‹คํ•จ์ˆ˜ wiw_i์— ๋Œ€ํ•œ ์ผ์ฐจ ๋ฏธ๋ถ„ ๊ฐ’์ดโˆ’2(dkโˆ’yk(t))โ‹…xk,i-2 (d_k - y_k^{(t)}) \cdot x_{k,i} ์ด๋ฏ€๋กœ ์œ„์™€ ๊ฐ™์€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ๊ฐ€๋Šฅํ•˜๋‹ค.


โœ”๏ธ ์ถœ๋ ฅ ์˜ค์ฐจ์— ๋”ฐ๋ฅธ ๊ฐ€์ค‘์น˜ ๋ณ€ํ™”

์ •๋‹ต์€ 1์ธ๋ฐ ์ถœ๋ ฅ์ด 0์ธ ๊ฒฝ์šฐ (False Negative):

ฮ”wi=ฮทโ‹…(1โˆ’0)โ‹…xk,i=ฮทโ‹…xk,i\Delta w_i = \eta \cdot (1 - 0) \cdot x_{k,i} = \eta \cdot x_{k,i}

โ†’ ๊ฐ€์ค‘์น˜๊ฐ€ ์ฆ๊ฐ€ โ†’ ์ถœ๋ ฅ์ด ๋” ์ปค์งˆ ๊ฐ€๋Šฅ์„ฑ ์ฆ๊ฐ€ โ†’ ๋‹ค์Œ์—๋Š” 1๋กœ ๋ถ„๋ฅ˜ํ•  ๊ฐ€๋Šฅ์„ฑ โ†‘

์ •๋‹ต์€ 0์ธ๋ฐ ์ถœ๋ ฅ์ด 1์ธ ๊ฒฝ์šฐ (False Positive):

ฮ”wi=ฮทโ‹…(0โˆ’1)โ‹…xk,i=โˆ’ฮทโ‹…xk,i\Delta w_i = \eta \cdot (0 - 1) \cdot x_{k,i} = -\eta \cdot x_{k,i}

โ†’ ๊ฐ€์ค‘์น˜๊ฐ€ ๊ฐ์†Œ โ†’ ์ถœ๋ ฅ์ด ์ž‘์•„์งˆ ๊ฐ€๋Šฅ์„ฑ ์ฆ๊ฐ€ โ†’ ๋‹ค์Œ์—๋Š” 0์œผ๋กœ ๋ถ„๋ฅ˜ํ•  ๊ฐ€๋Šฅ์„ฑ โ†‘


Perceptron Learning Algorithm Code

# Perceptron learning algorithm implementation
# (the setup below is reconstructed from the training log:
#  AND data with a bias input, zero-initialized weights, epsilon step)
import numpy as np

epsilon = 1e-7

def step_func(t):
    # Step activation: 1 above the (near-zero) threshold, else 0
    return 1 if t > epsilon else 0

# AND training data; the trailing 1 in each row is the constant bias input
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([0, 0, 0, 1])
W = np.zeros(len(X[0]))  # weights (bias included) start at zero

def perceptron_fit(X, Y, epochs=10):
    global W
    eta = 0.2  # learning rate

    for t in range(epochs):
        print("epoch =", t, "======================")
        for i in range(len(X)):
            predict = step_func(np.dot(X[i], W))
            error = Y[i] - predict  # error = target - output
            W += eta * error * X[i]  # perceptron weight update rule
            print("current input =", X[i],
                  "target =", Y[i],
                  "output =", predict,
                  "updated weights =", W)
        print("================================")

# Perceptron prediction function
def perceptron_predict(X, Y):
    global W
    for x in X:
        print(x[0], x[1], "->", step_func(np.dot(x, W)))

# Run training and prediction
perceptron_fit(X, y, 6)
perceptron_predict(X, y)
epoch = 0 ======================
current input = [0 0 1] target = 0 output = 0 updated weights = [0. 0. 0.]
current input = [0 1 1] target = 0 output = 0 updated weights = [0. 0. 0.]
current input = [1 0 1] target = 0 output = 0 updated weights = [0. 0. 0.]
current input = [1 1 1] target = 1 output = 0 updated weights = [0.2 0.2 0.2]
================================
epoch = 1 ======================
current input = [0 0 1] target = 0 output = 1 updated weights = [0.2 0.2 0. ]
current input = [0 1 1] target = 0 output = 1 updated weights = [ 0.2 0. -0.2]
current input = [1 0 1] target = 0 output = 0 updated weights = [ 0.2 0. -0.2]
current input = [1 1 1] target = 1 output = 0 updated weights = [0.4 0.2 0. ]
================================
epoch = 2 ======================
current input = [0 0 1] target = 0 output = 0 updated weights = [0.4 0.2 0. ]
current input = [0 1 1] target = 0 output = 1 updated weights = [ 0.4 0. -0.2]
current input = [1 0 1] target = 0 output = 1 updated weights = [ 0.2 0. -0.4]
current input = [1 1 1] target = 1 output = 0 updated weights = [ 0.4 0.2 -0.2]
================================
epoch = 3 ======================
current input = [0 0 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.2]
current input = [0 1 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.2]
current input = [1 0 1] target = 0 output = 1 updated weights = [ 0.2 0.2 -0.4]
current input = [1 1 1] target = 1 output = 0 updated weights = [ 0.4 0.4 -0.2]
================================
epoch = 4 ======================
current input = [0 0 1] target = 0 output = 0 updated weights = [ 0.4 0.4 -0.2]
current input = [0 1 1] target = 0 output = 1 updated weights = [ 0.4 0.2 -0.4]
current input = [1 0 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.4]
current input = [1 1 1] target = 1 output = 1 updated weights = [ 0.4 0.2 -0.4]
================================
epoch = 5 ======================
current input = [0 0 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.4]
current input = [0 1 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.4]
current input = [1 0 1] target = 0 output = 0 updated weights = [ 0.4 0.2 -0.4]
current input = [1 1 1] target = 1 output = 1 updated weights = [ 0.4 0.2 -0.4]
================================
0 0 -> 0
0 1 -> 0
1 0 -> 0
1 1 -> 1

scikit-learn-Based Perceptron Code

from sklearn.linear_model import Perceptron

# Samples and labels (AND)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

# Create the perceptron model
# tol is the stopping tolerance; random_state seeds the random number generator
clf = Perceptron(tol=1e-3, random_state=0)

# Train
clf.fit(X, y)

# Predict
print(clf.predict(X))  # expected: [0 0 0 1]

Output:
[0 0 0 1]
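
After fitting, the learned decision boundary can be read off the model through scikit-learn's standard coef_ and intercept_ attributes, which play the roles of the weights w and bias b above:

print(clf.coef_)       # learned weights w, shape (1, 2)
print(clf.intercept_)  # learned bias b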

ํผ์…‰ํŠธ๋ก ์˜ ํ•œ๊ณ„์ 

  • XOR ์—ฐ์‚ฐ

์•„๋ž˜์™€ ๊ฐ™์ด ์›ํ•˜๋Š” ์ถœ๋ ฅ์ด ๋‚˜์˜ค์ง€ ์•Š๋Š”๋‹ค.

...
X = np.array([  # training data set
    [0, 0, 1],  # the trailing 1 in each row is the constant input feeding the bias weight
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 1]
])
y = np.array([0, 1, 1, 0])  # NumPy array of targets (XOR)
...
...
0 0 -> 1
0 1 -> 1
1 0 -> 0
1 1 -> 0

โœ… Linearly Separable Problems

  • ํผ์…‰ํŠธ๋ก ์€ ์„ ํ˜• ๋ถ„๋ฅ˜์ž์ด๋‹ค
    ํผ์…‰ํŠธ๋ก ์€ ์ž…๋ ฅ ํŒจํ„ด์„ ์ง์„ (linear decision boundary)์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜ํ•˜๋Š” ์„ ํ˜• ๋ถ„๋ฅ˜์ž(linear classifier)์ด๋‹ค.

  • ํผ์…‰ํŠธ๋ก ์˜ ๊ฒฐ์ • ๊ฒฝ๊ณ„
    ํผ์…‰ํŠธ๋ก ์—์„œ ๊ฐ€์ค‘์น˜์™€ ๋ฐ”์ด์–ด์Šค๋Š” 2์ฐจ์› ์ž…๋ ฅ ๊ณต๊ฐ„์—์„œ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์ง์„ ์œผ๋กœ ํ•ด์„๋œ๋‹ค: ๊ทธ ํ›„ ์˜์—ญ์— ๋”ฐ๋ผ ์ถœ๋ ฅ์„ 0๊ณผ 1๋กœ ๋ถ„๋ฅ˜๋œ๋‹ค.

    w1x1+w2x2+b=0w_1 x_1 + w_2 x_2 + b = 0

ํ•œ ๊ฐœ์˜ ์ง์„ ์œผ๋กœ 0๊ณผ 1์„ ๋ช…ํ™•ํ•˜๊ฒŒ ๋‚˜๋ˆŒ ์ˆ˜ ์žˆ๋Š” ๋ฌธ์ œ๋ฅผ ์„ ํ˜• ๋ถ„๋ฆฌ ๊ฐ€๋Šฅ ๋ฌธ์ œ(linear separable problem)๋ผ๊ณ  ํ•œ๋‹ค.

โ€ข AND๋‚˜ OR์—ฐ์‚ฐ์€ ์„ ํ˜• ๋ถ„๋ฅ˜ ๊ฐ€๋Šฅ ๋ฌธ์ œ์ด๋‚˜, XOR์€ ๊ทธ๋ ‡์ง€ ์•Š๋‹ค.
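
A small sketch that scans a coarse grid of (w1, w2, b) for a line realizing each truth table. The helper name and the grid are illustrative choices, not from the post; it finds separators for AND and OR but none for XOR:

import itertools
import numpy as np

def find_linear_separator(targets):
    """Search a coarse grid of (w1, w2, b) for a line whose step
    output reproduces the given truth table on (0,0),(0,1),(1,0),(1,1)."""
    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    grid = np.arange(-2.0, 2.5, 0.5)
    for w1, w2, b in itertools.product(grid, repeat=3):
        outputs = [int(w1 * x1 + w2 * x2 + b >= 0) for x1, x2 in inputs]
        if outputs == targets:
            return (w1, w2, b)
    return None  # no line on the grid realizes this table

print("AND:", find_linear_separator([0, 0, 0, 1]))  # some (w1, w2, b)
print("OR: ", find_linear_separator([0, 1, 1, 1]))  # some (w1, w2, b)
print("XOR:", find_linear_separator([0, 1, 1, 0]))  # None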


โœ… Solving XOR with a Multilayer Perceptron

  • XOR ๋ฌธ์ œ๋Š” ์ง์„  ํ•˜๋‚˜๋กœ ํ•ด๊ฒฐ๋˜์ง€ ์•Š๋Š”๋‹ค
    XOR ์—ฐ์‚ฐ์€ ๋‹จ์ผ ์ง์„ ์œผ๋กœ๋Š” ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์—†๋Š” ์„ ํ˜• ๋ถ„๋ฆฌ ๋ถˆ๊ฐ€๋Šฅ ๋ฌธ์ œ์ด๋‹ค.
    ์ •ํ™•ํ•˜๊ฒŒ ๋ถ„๋ฅ˜ํ•˜๋ ค๋ฉด ์ตœ์†Œ 2๊ฐœ์˜ ๊ฒฐ์ • ๊ฒฝ๊ณ„(์ง์„ )๊ฐ€ ํ•„์š”ํ•˜๋‹ค.

  • ๋‹ค์ธต ๊ตฌ์กฐ์˜ ํ•„์š”์„ฑ
    ํ•˜๋‚˜์˜ ์œ ๋‹›(๋…ธ๋“œ)๋กœ๋Š” XOR ๋ฌธ์ œ๋ฅผ ๋ถ„๋ฆฌํ•  ์ˆ˜ ์—†๋‹ค.
    ๋”ฐ๋ผ์„œ ๋‹ค์Œ๊ณผ ๊ฐ™์ด 3๊ฐœ์˜ ์œ ๋‹›(yโ‚, yโ‚‚, y)์ด ํ•„์š”ํ•˜๋‹ค:

    • y1y_1: ์ฒซ ๋ฒˆ์งธ ์ง์„ ์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜
    • y1y_1: ๋‘ ๋ฒˆ์งธ ์ง์„ ์„ ๊ธฐ์ค€์œผ๋กœ ๋ถ„๋ฅ˜
    • yy: y1,y2y_1, y_2์˜ ์ถœ๋ ฅ์„ ๊ฒฐํ•ฉํ•˜์—ฌ ์ตœ์ข… ์ถœ๋ ฅ
  • ์€๋‹‰์ธต์„ ์ถ”๊ฐ€ํ•œ ๊ตฌ์กฐ: MLP
    ํผ์…‰ํŠธ๋ก ์—์„œ ์ž…๋ ฅ์ธต๊ณผ ์ถœ๋ ฅ์ธต ์‚ฌ์ด์— ์€๋‹‰์ธต(hidden layer)์„ ์ถ”๊ฐ€ํ•˜๋ฉด XOR ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋‹ค.
    ์ด๋Ÿฌํ•œ ๊ตฌ์กฐ๋ฅผ ๋‹ค์ธต ํผ์…‰ํŠธ๋ก (Multilayer Perceptron, MLP)์ด๋ผ๊ณ  ํ•œ๋‹ค.
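
A minimal hand-built sketch of such a network. The decomposition used here (y1 = OR, y2 = NAND, output = AND of the two) and the weights are one classic choice among many; nothing is learned, the weights are set by hand:

import numpy as np

def step(z):
    # Step activation: 1 if z >= 0, else 0
    return 1 if z >= 0 else 0

def unit(x, w, b):
    # A single perceptron unit: step(w . x + b)
    return step(np.dot(w, x) + b)

def xor_mlp(x1, x2):
    x = np.array([x1, x2])
    y1 = unit(x, np.array([1.0, 1.0]), -0.5)   # hidden unit y1 = OR(x1, x2)
    y2 = unit(x, np.array([-1.0, -1.0]), 1.5)  # hidden unit y2 = NAND(x1, x2)
    # Output unit combines the hidden outputs: AND(y1, y2)
    return unit(np.array([y1, y2]), np.array([1.0, 1.0]), -1.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_mlp(a, b))  # prints 0, 1, 1, 0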


โœ… ๋‹ค์ธต ํผ์…‰ํŠธ๋ก ๊ณผ ์—ญ์ „ํŒŒ ์•Œ๊ณ ๋ฆฌ์ฆ˜

  • ๋‹ค์ธต ํผ์…‰ํŠธ๋ก ์˜ ์ธต ์ˆ˜ ์ •์˜
    ์ผ๋ฐ˜์ ์œผ๋กœ ์‹ ๊ฒฝ๋ง์—์„œ ์ž…๋ ฅ์ธต, ์€๋‹‰์ธต, ์ถœ๋ ฅ์ธต์ด ์กด์žฌํ•˜์ง€๋งŒ,
    ๊ฐ€์ค‘์น˜๊ฐ€ ์ ์šฉ๋˜๋Š” ์ธต์€ ์€๋‹‰์ธต๊ณผ ์ถœ๋ ฅ์ธต๋ฟ์ด๋‹ค.
    ๋”ฐ๋ผ์„œ ์‹ ๊ฒฝ๋ง์ด ์ด 3๊ฐœ์˜ ์ธต์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์–ด๋„, ์‹ค์ œ๋กœ๋Š” 2์ธต ํผ์…‰ํŠธ๋ก ์ด๋ผ๊ณ  ๋ถ€๋ฅด๊ธฐ๋„ ํ•œ๋‹ค.

  • ๊ณผ๊ฑฐ์˜ ํ•œ๊ณ„์™€ ์˜ˆ์–ธ
    Minsky์™€ Papert๋Š” 1969๋…„ ์ €์„œ์—์„œ ๋‹ค์ธต ํผ์…‰ํŠธ๋ก ์„ ํ•™์Šต์‹œํ‚ค๋Š” ๊ฒƒ์ด ๋งค์šฐ ์–ด๋ ต๋‹ค๊ณ  ์ฃผ์žฅํ•˜์˜€๋‹ค.
    ์ด ๋•Œ๋ฌธ์— ํ•œ๋™์•ˆ ํผ์…‰ํŠธ๋ก  ์—ฐ๊ตฌ๋Š” ์ •์ฒด๋˜์—ˆ๊ณ , AI ๊ฒจ์šธ์ด๋ผ ๋ถˆ๋ฆฌ๋Š” ์นจ์ฒด๊ธฐ๋ฅผ ๋งž์ดํ•˜๊ฒŒ ๋œ๋‹ค.

  • ์—ญ์ „ํŒŒ ์•Œ๊ณ ๋ฆฌ์ฆ˜์˜ ์žฌ๋ฐœ๊ฒฌ
    ํ•˜์ง€๋งŒ 1980๋…„๋Œ€ ์ค‘๋ฐ˜, Rumelhart, Hinton ๋“ฑ์˜ ์—ฐ๊ตฌ์ž๋“ค์ด
    ๋‹ค์ธต ํผ์…‰ํŠธ๋ก ์„ ํšจ์œจ์ ์œผ๋กœ ํ•™์Šต์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ์—ญ์ „ํŒŒ ์•Œ๊ณ ๋ฆฌ์ฆ˜(Backpropagation)์„ ์žฌ๋ฐœ๊ฒฌํ•˜์˜€๋‹ค.
    ์ด๋กœ ์ธํ•ด ์ธ๊ณต ์‹ ๊ฒฝ๋ง์€ ๋‹ค์‹œ ์ฃผ๋ชฉ์„ ๋ฐ›๊ฒŒ ๋˜์—ˆ๊ณ , ์˜ค๋Š˜๋‚  ๋”ฅ๋Ÿฌ๋‹์˜ ๊ธฐ๋ฐ˜์ด ๋˜์—ˆ๋‹ค.

0๊ฐœ์˜ ๋Œ“๊ธ€