๐Ÿน ํŒŒ์ดํ† ์น˜๋กœ ๊ตฌํ˜„ํ•œ ๋…ผ๋ฆฌํšŒ๊ท€

๋ฏผ๋‹ฌํŒฝ์ด์šฐ์œ ยท2024๋…„ 9์›” 18์ผ

๐Ÿน ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ์ดˆ

๋ชฉ๋ก ๋ณด๊ธฐ
3/4
post-thumbnail

๐Ÿ’ก 1. Univariate Logistic Regression

  • Used for classification; the name "logistic regression" comes from the fact that it is derived from the linear regression formula
  • Based on regression analysis, but applied to classification problems
  • Mainly uses the sigmoid function
import torch
import torch.nn as nn

x = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([0.1, 0.2, 0.3])
b = torch.tensor(0.5)

# z = w1*x1 + w2*x2 + w3*x3 + b
z = torch.dot(w, x) + b # dot: inner product
z
> tensor(1.9000)

sigmoid = nn.Sigmoid()
output = sigmoid(z)
output
> tensor(0.8699)

๐Ÿน ์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜๋ž€?

  • ์ž…๋ ฅ ๋ฐ์ดํ„ฐ X์— ๋Œ€ํ•œ ์„ ํ˜• ๊ฒฐํ•ฉ์œผ๋กœ ๊ณ„์‚ฐ๋œ ๊ฒฐ๊ณผ๋ฅผ 0๊ณผ 1 ์‚ฌ์ด์˜ ๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•˜๋Š” ํ•จ์ˆ˜
  • 0๊ณผ 1 ์‚ฌ์ด์˜ ์—ฐ์†๋œ ๊ฐ’์„ ์ถœ๋ ฅ์œผ๋กœ ํ•˜๊ธฐ ๋•Œ๋ถ„์— ์ž„๊ณ„๊ฐ’(๋ณดํ†ต 0.5)๋ฅผ ๊ธฐ์ค€์œผ๋กœ ํŠน์ • ํด๋ž˜์Šค์— ์†ํ•  ํ™•๋ฅ ์„ ๊ณ„์‚ฐ
  • S์ž ๊ณก์„ ์„ ๊ทธ๋ฆฌ๋ฏ€๋กœ ๋ฏธ๋ถ„ ๊ฐ€๋Šฅํ•œ ํ˜•ํƒœ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์–ด ์ตœ์ ํ™”๊ฐ€ ์šฉ์ด
import torch.optim as optim
import matplotlib.pyplot as plt

torch.manual_seed(2024)

x_train = torch.FloatTensor([[0], [1], [3], [5], [8], [11], [15], [20]])
y_train = torch.FloatTensor([[0], [0], [0], [0], [1], [1], [1], [1]])
print(x_train.shape)
print(y_train.shape)
> torch.Size([8, 1])
> torch.Size([8, 1])
plt.figure(figsize=(8,5))
plt.scatter(x_train, y_train)

model = nn.Sequential(
    nn.Linear(1, 1),
    nn.Sigmoid()
)

model
> Sequential(
  (0): Linear(in_features=1, out_features=1, bias=True)
  (1): Sigmoid()
)
list(model.parameters()) # W: 0.0634 b: 0.6625
> [Parameter containing:
 tensor([[0.0634]], requires_grad=True),
 Parameter containing:
 tensor([0.6625], requires_grad=True)]

๐Ÿ’ก 2. Cost Function (Binary Cross Entropy)

  • ๋…ผ๋ฆฌํšŒ๊ท€์—์„œ๋Š” nn.BCELoss() ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Loss ๊ณ„์‚ฐ
  • 1๋ฒˆ ์‹œ๊ทธ๋งˆ, 2๋ฒˆ ์‹œ๊ทธ๋งˆ ์ค‘์—์„œ 1๋ฒˆ ์‹œ๊ทธ๋งˆ๋Š” ์ •๋‹ต์ด ์ฐธ์ด์—ˆ์„ ๋•Œ ๋ถ€๋ถ„, 2๋ฒˆ ์‹œ๊ทธ๋งˆ๋Š” ์ •๋‹ต์ด ๊ฑฐ์ง“์ด์—ˆ์„ ๋•Œ ๋ถ€๋ถ„
y_pred = model(x_train)

loss = nn.BCELoss()(y_pred, y_train)
loss
> tensor(0.6901, grad_fn=<BinaryCrossEntropyBackward0>)
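The same number can be reproduced by writing the binary cross entropy formula out by hand, which makes the two terms explicit; a minimal sketch reusing the y_pred and y_train from above (`manual_bce` is an illustrative name):

# BCE by hand: -mean(y*log(p) + (1-y)*log(1-p))
# the first term is active where y = 1, the second where y = 0
manual_bce = -(y_train * torch.log(y_pred) + (1 - y_train) * torch.log(1 - y_pred)).mean()
manual_bce # matches the nn.BCELoss() value above: 0.6901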
optimizer = optim.SGD(model.parameters(), lr=0.01)
epochs = 1000

for epoch in range(epochs + 1):
    y_pred = model(x_train)
    loss = nn.BCELoss()(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f'Epoch: {epoch}/{epochs} Loss: {loss:.6f}')
> Epoch: 0/1000 Loss: 0.690111
> Epoch: 100/1000 Loss: 0.612832
> Epoch: 200/1000 Loss: 0.550762
> Epoch: 300/1000 Loss: 0.498473
> Epoch: 400/1000 Loss: 0.454446
> Epoch: 500/1000 Loss: 0.417278
> Epoch: 600/1000 Loss: 0.385753
> Epoch: 700/1000 Loss: 0.358851
> Epoch: 800/1000 Loss: 0.335734
> Epoch: 900/1000 Loss: 0.315727
> Epoch: 1000/1000 Loss: 0.298285
list(model.parameters())
> [Parameter containing:
 tensor([[0.2875]], requires_grad=True),
 Parameter containing:
 tensor([-1.2444], requires_grad=True)]
x_test = torch.FloatTensor([[10]])
y_pred = model(x_test)
y_pred
> tensor([[0.8363]], grad_fn=<SigmoidBackward0>)
# setting the threshold
# 1 if the value is greater than or equal to 0.5
# 0 if the value is less than 0.5
y_bool = (y_pred >= 0.5).float()
y_bool
> tensor([[1.]])

๐Ÿ’ก 3. Multinomial Logistic Regression

x_train = [[1, 2, 1, 1],
           [2, 1, 3, 2],
           [3, 1, 3, 4],
           [4, 1, 5, 5],
           [1, 7, 5, 5],
           [1, 4, 5, 9],
           [1, 7, 7, 7],
           [2, 8, 7, 8]]
# 4 features
y_train = [0, 0, 0, 1, 1, 1, 2, 2]
# 3 classes

x_train = torch.FloatTensor(x_train)
y_train = torch.LongTensor(y_train)
print(x_train.shape)
print(y_train.shape)
> torch.Size([8, 4])
> torch.Size([8])
model = nn.Sequential(
    nn.Linear(4, 3) # 4 inputs, 3 outputs
    # In the univariate case earlier, the output size was 1 because Sigmoid returns a single probability of being 0 or 1
    # Here the model returns a probability for each class instead (the probabilities sum to 1)
)
model
> Sequential(
  (0): Linear(in_features=4, out_features=3, bias=True)
)
y_pred = model(x_train)
y_pred
> tensor([[-0.3467,  0.0954, -0.5403],
        [-0.3109, -0.0908, -1.3992],
        [-0.1401,  0.1226, -1.3379],
        [-0.4850,  0.0565, -2.1343],
        [-4.1847,  1.6323, -0.7154],
        [-3.6448,  2.2688, -0.0846],
        [-5.1520,  2.1004, -0.9593],
        [-5.2114,  2.1848, -1.0401]], grad_fn=<AddmmBackward0>)

3-1. CrossEntropyLoss

  • ๊ต์ฐจ ์—”ํŠธ๋กœํ”ผ ์†์‹ค ํ•จ์ˆ˜๋Š” Pytorch์—์„œ ์ œ๊ณตํ•˜๋Š” ์†์‹ค ํ•จ์ˆ˜ ์ค‘ ํ•˜๋‚˜๋กœ ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜ ๋ฌธ์ œ์— ์‚ฌ์šฉ
  • ์†Œํ”„ํŠธ๋งฅ์Šค ํ•จ์ˆ˜์™€ ๊ต์ฐจ ์—”ํŠธ๋กœํ”ผ ์†์‹ค ํ•จ์ˆ˜๋ฅผ ๊ฒฐํ•ฉํ•œ ํ˜•ํƒœ
  • ์†Œํ”„ํŠธ๋งฅ์Šค ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•˜์—ฌ ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•œ ํ™•๋ฅ  ๋ถ„ํฌ๋ฅผ ์–ป์Œ
  • ๊ฐ ํด๋ž˜์Šค์— ๋Œ€ํ•œ ๋กœ๊ทธ ํ™•๋ฅ ์„ ๊ณ„์‚ฐ
  • ์‹ค์ œ ๋ผ๋ฒจ๊ณผ ์˜ˆ์ธก ํ™•๋ฅ ์˜ ๋กœ๊ทธ๊ฐ’ ๊ฐ„์˜ ์ฐจ์ด๋ฅผ ๊ณ„์‚ฐ
  • ๊ณ„์‚ฐ๋œ ์ฐจ์ด์˜ ํ‰๊ท ์„ ๊ณ„์‚ฐํ•˜์—ฌ ์ตœ์ข… ์†์‹ค ๊ฐ’์„ ์–ป์Œ

3-2. Softmax

  • ๋‹ค์ค‘ ํด๋ž˜์Šค ๋ถ„๋ฅ˜ ๋ฌธ์ œ์—์„œ ์‚ฌ์šฉ๋˜๋Š” ํ•จ์ˆ˜๋กœ ์ฃผ์–ด์ง„ ์ž…๋ ฅ ๋ฒกํ„ฐ์˜ ๊ฐ’์„ ํ™•๋ฅ  ๋ถ„ํฌ๋กœ ๋ณ€ํ™˜
  • ๊ฐ ํด๋ž˜์Šค์— ์†ํ•  ํ™•๋ฅ ์„ ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ๊ฐ ์š”์†Œ๋ฅผ 0๊ณผ 1์‚ฌ์ด์˜ ๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ์ด ๊ฐ’๋“ค์˜ ํ•ฉ์€ ํ•ญ์ƒ 1์ด ๋˜๋„๋ก ํ•จ
  • ๊ฐ ์ž…๋ ฅ ๊ฐ’์— ๋Œ€ํ•ด ์ง€์ˆ˜ํ•จ์ˆ˜๋ฅผ ์ ์šฉ
  • ์ง€์ˆ˜ ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•œ ๋ชจ๋“  ๊ฐ’์˜ ํ•ฉ์„ ๊ณ„์‚ฐํ•œ ํ›„, ๊ฐ ์ง€์ˆ˜์˜ ํ•ฉ์œผ๋กœ ๋‚˜๋ˆ„์–ด ์ •๊ทœํ™”
  • ์ •๊ทœํ™”๋ฅผ ํ†ตํ•ด ๊ฐ ๊ฐ’์€ 0๊ณผ 1์‚ฌ์ด์˜ ํ™•๋ฅ  ๊ฐ’์œผ๋กœ ์ถœ๋ ฅ
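In other words, softmax(z)_i = exp(z_i) / sum_j exp(z_j). A minimal sketch on a made-up score vector (`scores` and its values are illustrative):

scores = torch.tensor([1.0, 2.0, 3.0]) # made-up raw scores
probs = torch.exp(scores) / torch.exp(scores).sum() # exponentiate, then normalize
probs # tensor([0.0900, 0.2447, 0.6652])
probs.sum() # always 1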
loss = nn.CrossEntropyLoss()(y_pred, y_train)
loss
> tensor(1.2760, grad_fn=<NllLossBackward0>)
optimizer = optim.SGD(model.parameters(), lr=0.01)
epochs = 10000

for epoch in range(epochs + 1):
    y_pred = model(x_train)
    loss = nn.CrossEntropyLoss()(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 100 == 0:
        print(f'Epoch: {epoch}/{epochs} Loss: {loss: .6f}')
> Epoch: 0/10000 Loss:  1.276001
> Epoch: 100/10000 Loss:  0.701097
> Epoch: 200/10000 Loss:  0.658791
> Epoch: 300/10000 Loss:  0.634072
  ...
> Epoch: 9700/10000 Loss:  0.285849
> Epoch: 9800/10000 Loss:  0.284502
> Epoch: 9900/10000 Loss:  0.283167
> Epoch: 10000/10000 Loss:  0.281846
x_test = torch.FloatTensor([[1, 7, 8, 7]])
y_pred = model(x_test)
y_pred
> tensor([[-10.6285,   1.0027,   4.9401]], grad_fn=<AddmmBackward0>)
# nn.Softmax(1)
# specifies the dimension along which the softmax is applied
# dim=0: softmax along the first dimension (computed across rows, so each column sums to 1)
# dim=1: softmax along the second dimension (computed across columns, so each row sums to 1)
# (a small demo of dim=0 vs dim=1 appears after this example)

y_prob = nn.Softmax(1)(y_pred)
y_prob
> tensor([[1.6993e-07, 1.9127e-02, 9.8087e-01]], grad_fn=<SoftmaxBackward0>)
print(f'0์ผ ํ™•๋ฅ : {y_prob[0][0]:.2f}')
print(f'1์ผ ํ™•๋ฅ : {y_prob[0][1]:.2f}')
print(f'2์ผ ํ™•๋ฅ : {y_prob[0][2]:.2f}')
> 0์ผ ํ™•๋ฅ : 0.00
> 1์ผ ํ™•๋ฅ : 0.02
> 2์ผ ํ™•๋ฅ : 0.98
torch.argmax(y_prob, dim=1) # the class with the highest probability
> tensor([2])
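
To make the dim argument from the comments above concrete, here is a minimal sketch on a made-up 2x2 matrix (`m` is an illustrative name): with dim=0 every column sums to 1, and with dim=1 every row sums to 1.

m = torch.tensor([[1.0, 2.0], [3.0, 4.0]]) # made-up 2x2 matrix
nn.Softmax(dim=0)(m).sum(dim=0) # tensor([1., 1.]) - each column sums to 1
nn.Softmax(dim=1)(m).sum(dim=1) # tensor([1., 1.]) - each row sums to 1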