[DL] PyTorch Notes

fragrance_0 · November 20, 2023
DL

Many deep learning frameworks exist; here I want to look at the most widely used one, PyTorch.
When training deep learning models and working on projects, it is easy to just import the framework and use it without much thought.
If you want to understand PyTorch better and make fuller use of it, it helps to know the overall workflow.

๐ŸŠ ํ”„๋ ˆ์ž„์›Œํฌ: ์ž‘์—…์„ ํšจ์œจ์ ์œผ๋กœ ํ•  ์ˆ˜ ์žˆ๋„๋ก ์งœ๋†“์€ ํ‹€

  • A collection of libraries and functions that help with a particular kind of task
  • Examples include PyTorch, TensorFlow, and Keras
  • In deep learning, PyTorch's share has been growing in recent years

๐Ÿ“š PYTORCH

  • A framework originally developed at Facebook

  • Built around the concept of the Tensor

  • Similar to NumPy arrays, but provides many features optimized for deep learning

  • Supports GPU acceleration for faster computation

    Move a tensor so it can be computed on the GPU:
    calling .to(device) makes PyTorch run the computation on the GPU automatically

  • PyTorch makes it easy to build models and to save or load trained (pre-trained) models

  • ๋ชจ๋ธ์˜ ์ƒํƒœ, ์•„ํ‚คํ…์ฒ˜ ๋ฐ ํ•™์Šต๋œ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ํŒŒ์ผ์— ์ €์žฅ ๊ฐ€๋Šฅ

๐Ÿ“Ž ๋„˜ํŒŒ์ด

  • Short for Numerical Python
  • A Python library that provides operations optimized for numerical computation
  • Supports arrays, which look similar to lists but compute much faster
  • Can operate on every element of an array at once
import numpy as np  # conventionally imported under the alias np

arr = np.array([1, 2, 3, 4, 5])
print(arr + 1)
print(arr * 2)

>>> Output: [2 3 4 5 6]
>>> Output: [2 4 6 8 10]
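For contrast, the same elementwise arithmetic on a plain Python list needs an explicit loop or comprehension; NumPy applies the operation to every element at once. A minimal plain-Python sketch:

```python
# Plain-Python equivalent of NumPy's vectorized arithmetic:
# every element has to be visited explicitly.
data = [1, 2, 3, 4, 5]

plus_one = [v + 1 for v in data]
doubled = [v * 2 for v in data]

print(plus_one)  # [2, 3, 4, 5, 6]
print(doubled)   # [2, 4, 6, 8, 10]
```

With NumPy the loop lives in optimized C code, which is where the speed difference comes from.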

โญ๏ธ Tensor ์—ฐ์‚ฐ

torch.tensor()

import torch

# Tensor ์ƒ์„ฑ
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])

# Perform a tensor operation
z = x + y
print(z)

>>> tensor([5, 7, 9])
  • Create two tensors, x and y
  • The + operator adds them element-wise
  • Adding two Python lists would concatenate them into [1, 2, 3, 4, 5, 6]
  • Adding two tensors produces [5, 7, 9] => tensor arithmetic

torch.zeros() | torch.ones()

# Create a (3, 4) tensor filled with zeros
zeros_tensor = torch.zeros(3, 4)

>>> tensor([[0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [0., 0., 0., 0.]])

# Create a (2, 2, 2) tensor filled with ones
ones_tensor = torch.ones(2, 2, 2)

>>> tensor([[[1., 1.],
             [1., 1.]],

            [[1., 1.],
             [1., 1.]]])


torch.matmul()

# Perform matrix multiplication
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])
result = torch.matmul(A, B)

>>> tensor([[19, 22],
            [43, 50]])

  • matmul is short for matrix multiplication
    + each entry of a matrix product is a sum of element-wise products
  • In deep learning, A typically holds the inputs and B the weight matrix
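As a sanity check, the sum-of-products rule can be applied by hand in plain Python: entry (i, j) of the result is the dot product of row i of A with column j of B, which reproduces the matmul output above:

```python
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# result[i][j] = sum over k of A[i][k] * B[k][j]
result = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]

print(result)  # [[19, 22], [43, 50]]
```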

โญ๏ธ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ

torch.cuda

  • Pytorch์—์„œ GPU๊ฐ€์†์„ ์œ„ํ•œ ํ•จ์ˆ˜๋ฅผ ์ œ๊ณตํ•จ
  • ํ…์„œ์™€ ๋ชจ๋ธ์„ GPU๋กœ ์ด๋™์‹œํ‚ค๋ฉด GPU์—์„œ ๊ณ„์‚ฐ์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ์Œ
  • ๋ชจ๋ธ๊ณผ ํ…์„œ ์ค‘ ํ•˜๋‚˜๋ผ๋„ GPU์— ์ด๋™๋˜์ง€ ์•Š์œผ๋ฉด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ์œผ๋‹ˆ ์ฃผ์˜

# Check whether CUDA is available; fall back to the CPU otherwise
if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
    print("CUDA is not available.")

x = torch.tensor([1, 2, 3]).to(device)  # move the tensor to the device
y = torch.tensor([4, 5, 6]).to(device)  # move the tensor to the device
z = x + y  # computed on the GPU when one is available
print(z)
result = z.cpu()  # move the result back to the CPU

>>> tensor([5, 7, 9], device='cuda:0')

๐Ÿ“Ž CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA. It lets developers harness the power of NVIDIA GPUs for general-purpose computing tasks beyond graphics processing.


torch.nn

  • In PyTorch, nn.Module is used to define neural network models, even simple ones
  • Create an instance of the model and print its structure
  • Understanding the __init__ and forward structure is important
import torch
import torch.nn as nn

# Define a neural network model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        
        # One fully-connected layer with input size 10 and output size 5
        self.fc = nn.Linear(10, 5)  # Fully-Connected Layer
	
    # forward -> the forward-pass computation
    def forward(self, x):
        x = self.fc(x)
        return x

# ๋ชจ๋ธ์˜ ์ธ์Šคํ„ด์Šค ์ƒ์„ฑ
model = Net()

# ๋ชจ๋ธ ๊ตฌ์กฐ ์ถœ๋ ฅ
input = torch.randn(1, 10)
output = model(input)
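What `nn.Linear(10, 5)` computes can be sketched in plain Python: y = xWᵀ + b, with a weight matrix of shape (5, 10) and a bias of length 5, i.e. 55 learnable parameters in total. The values below are made up for illustration; PyTorch initializes the real ones randomly:

```python
import random

in_features, out_features = 10, 5

# Hypothetical stand-ins for nn.Linear's weight matrix and bias vector
W = [[random.uniform(-0.3, 0.3) for _ in range(in_features)]
     for _ in range(out_features)]
b = [0.0] * out_features

def linear(x):
    # y_j = sum_i x_i * W[j][i] + b[j]  -- one output per row of W
    return [sum(xi * wij for xi, wij in zip(x, row)) + bj
            for row, bj in zip(W, b)]

x = [1.0] * in_features
y = linear(x)

n_params = out_features * in_features + out_features
print(len(y))     # 5 outputs
print(n_params)   # 55 learnable parameters
```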

โญ๏ธ ์†์‹คํ•จ์ˆ˜์™€ ์—ญ์ „ํŒŒ

nn.MSELoss() | loss.backward()

  • Loss computation and backpropagation are where PyTorch's strengths stand out

  • Both can be implemented with very little code, as below

  • Define the loss function as MSE (mean squared error): nn.MSELoss()
    -> compares the input tensor with the target tensor and computes the loss

  • loss.backward() performs backpropagation

  • Print the gradient of the loss with respect to the input tensor

import torch
import torch.nn as nn

# Create random input and target tensors
input = torch.randn(3, requires_grad=True)
target = torch.tensor([0.5, -1, 2])

# Define the loss function
loss_fn = nn.MSELoss()

# Compute the loss
loss = loss_fn(input, target)

# Perform backpropagation
loss.backward()

# Print the gradient
print(input.grad)
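The gradient that `loss.backward()` fills into `input.grad` has a closed form for MSE: with n elements, ∂L/∂xᵢ = 2(xᵢ − tᵢ)/n. A plain-Python check with made-up input values (the targets match the example above):

```python
x = [1.0, 0.0, 1.0]    # hypothetical stand-in for the random input tensor
t = [0.5, -1.0, 2.0]   # the target values from the example
n = len(x)

# MSE loss: mean of the squared differences
loss = sum((xi - ti) ** 2 for xi, ti in zip(x, t)) / n
print(loss)  # 0.75

# analytic gradient of the MSE loss w.r.t. each input element
grad = [2 * (xi - ti) / n for xi, ti in zip(x, t)]
print(grad)
```

For these values the gradient is [1/3, 2/3, -2/3], which is what autograd would report in `input.grad`.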

โญ๏ธ ์˜ตํ‹ฐ๋งˆ์ด์ €

torch.optim

import torch
import torch.nn as nn
import torch.optim as optim

# ๋ฌด์ž‘์œ„ ๋ชจ๋ธ๊ณผ ์ž…๋ ฅ Tensor ์ƒ์„ฑ
model = nn.Linear(5,1)
input = torch.randn(2, 5)

# ์˜ตํ‹ฐ๋งˆ์ด์ € ์ •์˜ -> SGD(ํ™•๋ฅ ์  ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•)์œผ๋กœ ์ •์˜
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Reset the gradients
optimizer.zero_grad()

# Forward pass
output = model(input)

# Compute the loss
loss = output.mean()

# Backpropagation
loss.backward()

# ๋ชจ๋ธ ํŒŒ๋ผ๋ฏธํ„ฐ ์—…๋ฐ์ดํŠธ
optimizer.step()

# ์—…๋ฐ์ดํŠธ๋œ ํŒŒ๋ผ๋ฏธํ„ฐ ์ถœ๋ ฅ
print(model.weight)
  • SGD: Stochastic Gradient Descent
  • optimizer.zero_grad(): resets the gradients to zero
  • lr: learning rate
  • The input tensor is passed through the model (forward pass), and the mean of the output is used as the loss
  • loss.backward() and optimizer.step() then perform backpropagation and the parameter update
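The update that `optimizer.step()` applies for plain SGD is just w ← w − lr · grad for every parameter. A one-parameter sketch with made-up numbers:

```python
lr = 0.1     # learning rate, matching the example above
w = 2.0      # a single stand-in parameter
grad = 0.5   # gradient that a backward pass would have stored

# zero_grad() matters because gradients accumulate: without it,
# the next backward() would add onto 0.5 instead of replacing it.
w = w - lr * grad
print(w)  # 1.95
```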

โญ๏ธ ๋ชจ๋ธ ์ €์žฅ๊ณผ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

  • Instead of retraining from scratch every time, save the model once training is done
  • torch.save() makes saving easy => models are usually saved with a .pt or .pth extension
  • torch.load() loads the saved file back in
  • model.load_state_dict() assigns the saved weights to the model variable

torch.save() | torch.load()

import torch
import torch.nn as nn

# Define a simple model => same structure as the network defined above
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = nn.Linear(10, 5)

    def forward(self, x):
        x = self.fc(x)
        return x

# ๋ชจ๋ธ ์ธ์Šคํ„ด์Šค ์ƒ์„ฑ
model = Net()

# ๋ชจ๋ธ ์ƒํƒœ ์ €์žฅ
torch.save(model.state_dict(), 'model.pth')

# ๋ชจ๋ธ ์ƒํƒœ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ
model.load_state_dict(torch.load('model.pth'))

# Run inference with the loaded model
input = torch.randn(1, 10)
output = model(input)
print(output)
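Conceptually, `state_dict()` is a mapping from parameter names to tensors; for the `Net` above the keys are `fc.weight` and `fc.bias`. The save/load round trip can be mimicked with plain Python's `pickle` (torch.save itself is built on a pickle-based format); the parameter values here are made up for illustration:

```python
import pickle

# stand-in for a state_dict: parameter name -> parameter values
state = {"fc.weight": [[0.1] * 10 for _ in range(5)],
         "fc.bias": [0.0] * 5}

blob = pickle.dumps(state)     # torch.save: serialize to bytes / a file
restored = pickle.loads(blob)  # torch.load: deserialize it back

print(sorted(restored))        # ['fc.bias', 'fc.weight']
print(restored == state)       # True -- the round trip preserves all values
```

This is why the loading code must first build a `Net` instance with the same architecture: `load_state_dict()` only restores the values, not the structure.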

[Source | ๋”ฅ๋‹ค์ด๋ธŒ Code.zip magazine]
