🎲 [AI] Guessing the Weather from Images (Simple Deep Learning Model vs. Deep Model)

mandu · April 20, 2025


ํ•ด๋‹น ๊ธ€์€ FastCampus - '๋ชจ๋‘๋ฅผ ์œ„ํ•œ 2025 AI ๋ฐ”์ด๋ธ” : AI Signature' ๊ฐ•์˜๋ฅผ ๋“ฃ๊ณ ,
์ถ”๊ฐ€ ํ•™์Šตํ•œ ๋‚ด์šฉ์„ ๋ง๋ถ™์—ฌ ์ž‘์„ฑํ•˜์˜€์Šต๋‹ˆ๋‹ค.

1. ๋ฐ์ดํ„ฐ ์…‹ cloning

!git clone https://github.com/ndb796/weather_dataset
%cd weather_dataset  # change the current working directory to the weather_dataset folder

2. ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ์„ธํŒ…

import torch
import torchvision
import torchvision.transforms as transforms
import torchvision.models as models
import torchvision.datasets as datasets

import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import random_split

import matplotlib.pyplot as plt
import matplotlib.image as image
import numpy as np

3. ๋ฐ์ดํ„ฐ ์„ธํŠธ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ(Load Dataset)

  1. Data augmentation can be specified when initializing the dataset.
    • When loading images, declare which operations (rotation, cropping, flipping, etc.) will be applied.
  2. Afterwards, DataLoader() actually loads the data.
    • Specify which dataset to use, the batch size, whether to shuffle the data, and so on.
    • Using next(), you can obtain the data batch by batch in tensor form (see the sanity check after the DataLoader setup below).

Data Augmentation
ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ๋Š˜๋ฆฌ๊ฑฐ๋‚˜ ๋‹ค์–‘ํ•˜๊ฒŒ ๋งŒ๋“œ๋Š” ๊ธฐ๋ฒ•
์˜ˆ: ์ด๋ฏธ์ง€๋ฅผ ํšŒ์ „, ๋’ค์ง‘๊ธฐ, ๋ฐ๊ธฐ ์กฐ์ ˆ ๋“ฑ์œผ๋กœ ๋‹ค์–‘ํ•˜๊ฒŒ ๋ณ€ํ˜•

# Image preprocessing: define a sequential pipeline of torchvision transforms for the training data
transform_train = transforms.Compose([
    transforms.Resize((256, 256)),           # resize every image to 256x256 using bilinear interpolation (2D):
                                             # each new pixel value is interpolated from its 2x2 (4) nearest neighbors
                                             # (a small numeric sketch follows this block)
    transforms.RandomHorizontalFlip(),       # randomly flip images left-right (data augmentation)
    transforms.ToTensor(),                   # PIL image → Tensor, scaled to [0, 1]
    transforms.Normalize(                    # normalize: (x - 0.5) / 0.5 maps [0, 1] → [-1, 1]
        mean=[0.5, 0.5, 0.5],                # per-channel RGB mean
        std=[0.5, 0.5, 0.5]                  # per-channel RGB standard deviation
    )
])
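
To make the bilinear interpolation comment above concrete, here is a minimal numeric sketch (my own illustration, not torchvision's actual implementation): the new pixel value is a distance-weighted average of its four nearest neighbors.

import numpy as np

corners = np.array([[10., 20.],   # values at (row 0, col 0) and (row 0, col 1)
                    [30., 40.]])  # values at (row 1, col 0) and (row 1, col 1)
dy, dx = 0.25, 0.5                # fractional position of the new pixel inside the 2x2 cell

top = corners[0, 0] * (1 - dx) + corners[0, 1] * dx     # interpolate along the top edge
bottom = corners[1, 0] * (1 - dx) + corners[1, 1] * dx  # interpolate along the bottom edge
value = top * (1 - dy) + bottom * dy                    # interpolate between the two edges
print(value)  # 20.0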

# ๊ฒ€์ฆ(validation)์šฉ ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ: ํ•™์Šต๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ ๋ฐ์ดํ„ฐ ์ฆ๊ฐ• ์—†์Œ
transform_val = transforms.Compose([
    transforms.Resize((256, 256)),           # ํฌ๊ธฐ๋งŒ ์กฐ์ •
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]
    )
])

# ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ๋„ ๊ฒ€์ฆ๊ณผ ๋™์ผ (์ผ๊ด€์„ฑ ์žˆ๊ฒŒ ์ฆ๊ฐ• ์—†์ด)
transform_test = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]
    )
])

# Load the training dataset (ImageFolder automatically maps folder names to labels)
train_dataset = datasets.ImageFolder(
    root='train/',                   # folder containing the training images
    transform=transform_train        # apply the transform defined above
)

# Split off part of the training data for validation
dataset_size = len(train_dataset)     # total number of samples
train_size = int(dataset_size * 0.8)  # 80% for training
val_size = dataset_size - train_size  # 20% for validation

# Split into training / validation datasets
train_dataset, val_dataset = random_split(train_dataset, [train_size, val_size])
val_dataset.dataset.transform = transform_val
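
One caveat: random_split() returns two Subset objects that share the same underlying ImageFolder, so the assignment above also changes the transform seen by train_dataset. A minimal workaround, sketched below (my own addition, not part of the original code, and it would slightly change the results later in this post), is to load the folder twice and split by a shared index permutation:

from torch.utils.data import Subset

full_train = datasets.ImageFolder(root='train/', transform=transform_train)
full_val = datasets.ImageFolder(root='train/', transform=transform_val)

indices = torch.randperm(len(full_train)).tolist()        # one shared random permutation
train_dataset = Subset(full_train, indices[:train_size])  # 80%, with augmentation
val_dataset = Subset(full_val, indices[train_size:])      # 20%, without augmentation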

# ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹ ๋กœ๋”ฉ (๋ผ๋ฒจ๋„ ์ž๋™ ๋งคํ•‘๋จ)
test_dataset = datasets.ImageFolder(
    root='test/',
    transform=transform_test
)

# ๋ฐ์ดํ„ฐ๋กœ๋”(DataLoader) ์„ค์ •: ๋ฐฐ์น˜ ๋‹จ์œ„๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ๋กœ๋”ฉ
train_dataloader = torch.utils.data.DataLoader(
    train_dataset, batch_size=64, shuffle=True   # ํ•™์Šต์šฉ: ๋งค epoch๋งˆ๋‹ค ์„ž๊ธฐ
)
val_dataloader = torch.utils.data.DataLoader(
    val_dataset, batch_size=64, shuffle=False    # ๊ฒ€์ฆ์šฉ: ์ˆœ์„œ ๊ณ ์ •
)
test_dataloader = torch.utils.data.DataLoader(
    test_dataset, batch_size=64, shuffle=False   # ํ…Œ์ŠคํŠธ์šฉ: ์ˆœ์„œ ๊ณ ์ •
)
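
As a quick sanity check (my addition, not in the original post), one batch drawn from the loader is a (batch, channel, height, width) tensor:

imgs, labels = next(iter(train_dataloader))
print(imgs.shape)    # torch.Size([64, 3, 256, 256])
print(labels.shape)  # torch.Size([64])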

4. ๋ฐ์ดํ„ฐ ์‹œ๊ฐํ™”(Data Visualization)


# ์‹œ๊ฐํ™” ์„ค์ •
plt.rcParams['figure.figsize'] = [12, 8]  # ๊ทธ๋ž˜ํ”„ ํฌ๊ธฐ ์„ค์ •
plt.rcParams['figure.dpi'] = 60           # ํ•ด์ƒ๋„ ์„ค์ •
plt.rcParams.update({'font.size': 20})    # ๊ธ€๊ผด ํฌ๊ธฐ ์„ค์ •

# Image display function
def imshow(input):
    # tensor → numpy array, with channel order changed from (C, H, W) to (H, W, C)
    input = input.numpy().transpose((1, 2, 0))
    
    # undo the normalization (restore the original pixel values)
    mean = np.array([0.5, 0.5, 0.5])
    std = np.array([0.5, 0.5, 0.5])
    input = std * input + mean
    input = np.clip(input, 0, 1)  # clamp values to the range [0, 1]
    
    # display the image
    plt.imshow(input)
    plt.show()

# Class label names
class_names = {
  0: "Cloudy",
  1: "Rain",
  2: "Shine",
  3: "Sunrise"
}
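
The hard-coded mapping above mirrors ImageFolder's alphabetical class indexing. Assuming the dataset's folder names match these labels, it can also be read directly from the dataset, as in this short check (my addition):

print(test_dataset.classes)       # ['Cloudy', 'Rain', 'Shine', 'Sunrise']
print(test_dataset.class_to_idx)  # {'Cloudy': 0, 'Rain': 1, 'Shine': 2, 'Sunrise': 3}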

# ํ•™์Šต ์ด๋ฏธ์ง€ ๋ฐฐ์น˜ ํ•˜๋‚˜ ๋กœ๋“œ
iterator = iter(train_dataloader)  # DataLoader๋ฅผ ์ดํ„ฐ๋ ˆ์ดํ„ฐ๋กœ ๋ณ€ํ™˜
imgs, labels = next(iterator)     # ์ฒซ ๋ฐฐ์น˜ ๋กœ๋“œ

# ์ด๋ฏธ์ง€ ๊ทธ๋ฆฌ๋“œ๋กœ ๋ฌถ๊ธฐ (์•ž์—์„œ 4์žฅ๋งŒ ๋ณด๊ธฐ)
out = torchvision.utils.make_grid(imgs[:4])

# ์ด๋ฏธ์ง€ ์ถœ๋ ฅ
imshow(out)

# ํ•ด๋‹น ์ด๋ฏธ์ง€๋“ค์˜ ํด๋ž˜์Šค ์ด๋ฆ„ ์ถœ๋ ฅ
print([class_names[labels[i].item()] for i in range(4)])

Output

['Rain', 'Sunrise', 'Sunrise', 'Shine']


5. Training the Deep Learning Models

  • ๋ ˆ์ด์–ด์˜ ๊นŠ์ด๋ฅผ ๋Š˜๋ฆฌ๊ณ  ํŒŒ๋ผ๋ฏธํ„ฐ์˜ ๊ฐœ์ˆ˜๋ฅผ ์ฆ๊ฐ€์‹œ์ผœ ๋ณด๋ฉด์„œ,
    ๋‹จ์ผ ์„ ํ˜•์ธต๋งŒ ์žˆ๋Š” ์•„์ฃผ ๋‹จ์ˆœํ•œ ๋ชจ๋ธ๋ถ€ํ„ฐ, ์€๋‹‰์ธต + ๋“œ๋กญ ์•„์›ƒ์ด ํฌํ•จ๋œ ์‹ฌ์ธต ๋ชจ๋ธ๊นŒ์ง€,
    ์ด ์„ธ ๊ฐœ์˜ ๋ชจ๋ธ์„ ๊ฐ๊ฐ ํ•™์Šต
  • ์€๋‹‰์ธต: Input layer์™€ Output layer ์‚ฌ์ด ๋ชจ๋“  ์ธต์„ ๋งํ•จ
  • ๋“œ๋กญ ์•„์›ƒ: ๊ณผ์ ํ•ฉ(overfitting)์„ ๋ง‰๊ธฐ ์œ„ํ•œ ๊ทœ์ œ(regularization) ๊ธฐ๋ฒ• ์ค‘ ํ•˜๋‚˜๋กœ, ํ•™์Šตํ•  ๋•Œ ์ผ๋ถ€ ๋‰ด๋Ÿฐ์„ ๋ฌด์ž‘์œ„๋กœ ๊บผ๋ฒ„๋ฆฌ๋Š” ๊ฒƒ
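
Here is the sketch referenced above: a tiny, self-contained demonstration (my addition) of how nn.Dropout behaves differently in training and evaluation mode.

import torch
import torch.nn as nn

drop = nn.Dropout(0.5)
x = torch.ones(8)

drop.train()    # training mode: each entry is zeroed with probability 0.5,
print(drop(x))  # and survivors are scaled by 1/(1-0.5) = 2, e.g. tensor([2., 0., 2., ...]) (random each call)

drop.eval()     # evaluation mode: dropout is a no-op
print(drop(x))  # tensor([1., 1., 1., 1., 1., 1., 1., 1.])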

๋ชจ๋ธ ์ •์˜

import torch.nn as nn
import torch.nn.functional as F

# ๋‹จ์ผ ์„ ํ˜•์ธต๋งŒ ์žˆ๋Š” ์•„์ฃผ ๋‹จ์ˆœํ•œ ๋ชจ๋ธ
class Model1(nn.Module):
    def __init__(self):
        super(Model1, self).__init__() # forward, parameters(), .to(device) ๊ฐ™์€ ๋ถ€๋ชจ ํด๋ž˜์Šค ๊ธฐ๋Šฅ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด ๋ถ€๋ชจ ํด๋ž˜์Šค ์ดˆ๊ธฐํ™”
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(256 * 256 * 3, 4)  # ์ž…๋ ฅ์„ flatten ํ›„ ๋ฐ”๋กœ ํด๋ž˜์Šค 4๊ฐœ๋กœ ๋งคํ•‘

    def forward(self, x):
        x = self.flatten(x)      # ์ด๋ฏธ์ง€(3x256x256) โ†’ 1D ๋ฒกํ„ฐ
        x = self.linear1(x)      # ์„ ํ˜• ๋ณ€ํ™˜
        return x


# Model with one hidden layer added
class Model2(nn.Module):
    def __init__(self):
        super(Model2, self).__init__()
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(256 * 256 * 3, 64)  # first hidden layer
        self.linear2 = nn.Linear(64, 4)              # output layer

    def forward(self, x):
        x = self.flatten(x)
        x = self.linear1(x)      # hidden layer (no activation function; note that without a nonlinearity,
                                 # two stacked linear layers collapse into a single linear map)
        x = self.linear2(x)
        return x


# Deep model with hidden layers + dropout
class Model3(nn.Module):
    def __init__(self):
        super(Model3, self).__init__()
        self.flatten = nn.Flatten()
        self.linear1 = nn.Linear(256 * 256 * 3, 128)
        self.dropout1 = nn.Dropout(0.5)
        self.linear2 = nn.Linear(128, 64)
        self.dropout2 = nn.Dropout(0.5)
        self.linear3 = nn.Linear(64, 32)
        self.dropout3 = nn.Dropout(0.5)
        self.linear4 = nn.Linear(32, 4)  # final output over the 4 classes

    def forward(self, x):
        x = self.flatten(x)
        x = F.relu(self.linear1(x))     # ReLU + Dropout, repeated
        x = self.dropout1(x)
        x = F.relu(self.linear2(x))
        x = self.dropout2(x)
        x = F.relu(self.linear3(x))
        x = self.dropout3(x)
        x = self.linear4(x)
        return x
        
# Transfer learning with resnet50 (pretrained on ImageNet, a large-scale image dataset):
# reuse an already-trained model and adapt it to our own task
class ModelTransfer(nn.Module):
    def __init__(self):
        super(ModelTransfer, self).__init__()
        # load the ResNet50 model (pretrained=True is deprecated in newer torchvision; weights=... is the current spelling)
        model = models.resnet50(pretrained=True)
        num_features = model.fc.in_features  # input size of the final FC layer (2048)
        # model.fc.in_features  → 2048 (input size)
        # model.fc.out_features → 1000 (output size)
        model.fc = nn.Linear(num_features, 4)  # replace the fc (fully connected, final) layer: a 4-class classifier instead of the original 1000 ImageNet classes
        self.model = model  # store as self.model

    def forward(self, x):  # method overriding
        return self.model(x)  # run the model in forward
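
To see how quickly "increasing the number of parameters" plays out here, the short sketch below (my addition; it reuses the classes defined above) counts each model's parameters. With a 256x256x3 input flattened to 196,608 features, even a single 64-unit hidden layer already costs over 12 million weights.

for m in [Model1(), Model2(), Model3()]:
    n = sum(p.numel() for p in m.parameters())
    print(f'{type(m).__name__}: {n:,} parameters')
# Model1:    786,436 parameters
# Model2: 12,583,236 parameters
# Model3: 25,176,420 parameters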

๋ชจ๋ธ ํ•™์Šต, ๊ฒ€์ฆ, ํ…Œ์ŠคํŠธ ํ•จ์ˆ˜ ์ •์˜

import time

# Training function (relies on the globals model, optimizer, criterion, epoch, and log_step defined in section 6)
def train():
    start_time = time.time()
    print(f'[Epoch: {epoch + 1} - Training]')
    model.train()  # put the model in training mode
    total = 0
    running_loss = 0.0
    running_corrects = 0

    for i, batch in enumerate(train_dataloader):
        imgs, labels = batch
        imgs, labels = imgs.cuda(), labels.cuda()  # move to the GPU

        outputs = model(imgs)              # forward pass
        optimizer.zero_grad()              # reset the gradients from the previous step
        _, preds = torch.max(outputs, 1)   # index of the largest value along the class dimension (axis 1)
        loss = criterion(outputs, labels)  # compute the loss

        loss.backward()        # backpropagation → compute gradients
        optimizer.step()       # update the parameters

        total += labels.shape[0]
        running_loss += loss.item()  # .item() extracts the loss as a plain float from the tensor holding the computation graph
        running_corrects += torch.sum(preds == labels.data)

        # print a log every log_step batches
        if i % log_step == log_step - 1:
            print(f'[Batch: {i + 1}] running train loss: {running_loss / total}, running train accuracy: {running_corrects / total}')

    print(f'train loss: {running_loss / total}, accuracy: {running_corrects / total}')
    print("elapsed time:", time.time() - start_time)
    return running_loss / total, (running_corrects / total).item()


# ๊ฒ€์ฆ ํ•จ์ˆ˜
def validate():
    start_time = time.time()
    print(f'[Epoch: {epoch + 1} - Validation]')
    model.eval()  # put the model in evaluation mode
    total = 0
    running_loss = 0.0
    running_corrects = 0

    for i, batch in enumerate(val_dataloader):
        imgs, labels = batch
        imgs, labels = imgs.cuda(), labels.cuda()

        with torch.no_grad():  # disable gradient tracking → faster and saves memory
            outputs = model(imgs)
            max_values, preds = torch.max(outputs, 1) 
            loss = criterion(outputs, labels)

        total += labels.shape[0]
        running_loss += loss.item()
        running_corrects += torch.sum(preds == labels.data)

        if (i == 0) or (i % log_step == log_step - 1):
            print(f'[Batch: {i + 1}] running val loss: {running_loss / total}, running val accuracy: {running_corrects / total}')

    print(f'val loss: {running_loss / total}, accuracy: {running_corrects / total}')
    print("elapsed time:", time.time() - start_time)
    return running_loss / total, (running_corrects / total).item()


# Test function (almost identical to validation)
def test():
    start_time = time.time()
    print(f'[Test]')
    model.eval()
    total = 0
    running_loss = 0.0
    running_corrects = 0

    for i, batch in enumerate(test_dataloader):
        imgs, labels = batch
        imgs, labels = imgs.cuda(), labels.cuda()

        with torch.no_grad():
            outputs = model(imgs)
            _, preds = torch.max(outputs, 1)
            loss = criterion(outputs, labels)

        total += labels.shape[0]
        running_loss += loss.item()
        running_corrects += torch.sum(preds == labels.data)

        if (i == 0) or (i % log_step == log_step - 1):
            print(f'[Batch: {i + 1}] running test loss: {running_loss / total}, running test accuracy: {running_corrects / total}')

    print(f'test loss: {running_loss / total}, accuracy: {running_corrects / total}')
    print("elapsed time:", time.time() - start_time)
    return running_loss / total, (running_corrects / total).item()

A function that adjusts the learning rate according to the epoch

  • Early on, train quickly with a large learning rate
    → move quickly into the neighborhood of the optimum

  • Later, fine-tune with a small learning rate
    → having already reached a fairly good point, steps that are too large could overshoot the optimum

  • This reduces overfitting and improves generalization, yielding better performance and more stable convergence
    (a built-in equivalent is sketched after the function below)


def adjust_learning_rate(optimizer, epoch):
    lr = learning_rate  # start from the initial learning rate

    # from epoch >= 3, cut the learning rate to 1/10
    if epoch >= 3:
        lr /= 10
    # from epoch >= 7, cut it by another factor of 10 (i.e., 1/100 of the original)
    if epoch >= 7:
        lr /= 10

    # apply the new learning rate to every parameter group in the optimizer
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
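
For reference, torch.optim.lr_scheduler ships the same step schedule out of the box. Once an optimizer exists, the one-liner below (my addition, not used in this post) should be roughly equivalent, with scheduler.step() called once at the end of each epoch instead of calling adjust_learning_rate():

scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 7], gamma=0.1)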

6. Checking the Training Results

Model 1: ๋‹จ์ผ ์„ ํ˜•์ธต๋งŒ ์žˆ๋Š” ์•„์ฃผ ๋‹จ์ˆœํ•œ ๋ชจ๋ธ

learning_rate = 0.01
log_step = 20

model = Model1()
model = model.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

num_epochs = 20
best_val_acc = 0
best_epoch = 0

history = []
accuracy = []
for epoch in range(num_epochs):
    adjust_learning_rate(optimizer, epoch)
    train_loss, train_acc = train()
    val_loss, val_acc = validate()
    history.append((train_loss, val_loss))
    accuracy.append((train_acc, val_acc))

    if val_acc > best_val_acc:
        print("[Info] best validation accuracy!")
        best_val_acc = val_acc
        best_epoch = epoch
        torch.save(model.state_dict(), f"best_checkpoint_epoch_{epoch + 1}.pth")

torch.save(model.state_dict(), f"last_checkpoint_epoch_{num_epochs}.pth")

plt.plot([x[0] for x in accuracy], 'b', label='train')
plt.plot([x[1] for x in accuracy], 'r--', label='validation')
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

test_loss, test_accuracy = test()
print(f"Test loss: {test_loss:.8f}")
print(f"Test accuracy: {test_accuracy * 100.:.2f}%")

Output

[Epoch: 1 - Training]
train loss: 0.2685530501824838, accuracy: 0.6029629707336426
elapsed time: 6.362705230712891
[Epoch: 1 - Validation]
[Batch: 1] running val loss: 0.25105223059654236, running val accuracy: 0.65625
val loss: 0.36737381071734004, accuracy: 0.692307710647583
elapsed time: 1.214353322982788
[Info] best validation accuracy!


...

[Epoch: 20 - Training]
train loss: 0.05483319741708261, accuracy: 0.8414815068244934
elapsed time: 5.412205696105957
[Epoch: 20 - Validation]
[Batch: 1] running val loss: 0.1295606642961502, running val accuracy: 0.734375
val loss: 0.1441787206209623, accuracy: 0.7337278127670288
elapsed time: 1.2089176177978516
[Test]
[Batch: 1] running test loss: 0.2731671631336212, running test accuracy: 0.625
test loss: 0.17309151234575862, accuracy: 0.7295373678207397
elapsed time: 0.9299411773681641
Test loss: 0.17309151
Test accuracy: 72.95%


Model 2: ์€๋‹‰์ธต(hidden layer) 1๊ฐœ ์ถ”๊ฐ€๋œ ๋ชจ๋ธ

learning_rate = 0.01
log_step = 20

model = Model2()
model = model.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

num_epochs = 20
best_val_acc = 0
best_epoch = 0

history = []
accuracy = []
for epoch in range(num_epochs):
    adjust_learning_rate(optimizer, epoch)
    train_loss, train_acc = train()
    val_loss, val_acc = validate()
    history.append((train_loss, val_loss))
    accuracy.append((train_acc, val_acc))

    if val_acc > best_val_acc:
        print("[Info] best validation accuracy!")
        best_val_acc = val_acc
        best_epoch = epoch
        torch.save(model.state_dict(), f"best_checkpoint_epoch_{epoch + 1}.pth")

torch.save(model.state_dict(), f"last_checkpoint_epoch_{num_epochs}.pth")

plt.plot([x[0] for x in accuracy], 'b', label='train')
plt.plot([x[1] for x in accuracy], 'r--', label='validation')
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

test_loss, test_accuracy = test()
print(f"Test loss: {test_loss:.8f}")
print(f"Test accuracy: {test_accuracy * 100.:.2f}%")

Output

[Epoch: 1 - Training]
train loss: 0.062059673556575067, accuracy: 0.5688889026641846
elapsed time: 7.238025426864624
[Epoch: 1 - Validation]
[Batch: 1] running val loss: 0.10530433803796768, running val accuracy: 0.359375
val loss: 0.10420821263239934, accuracy: 0.4201183617115021
elapsed time: 1.2768044471740723
[Info] best validation accuracy!

...

[Epoch: 20 - Training]
train loss: 0.04108952063101309, accuracy: 0.7881481647491455
elapsed time: 7.323403835296631
[Epoch: 20 - Validation]
[Batch: 1] running val loss: 0.06464774161577225, running val accuracy: 0.609375
val loss: 0.0676442956077982, accuracy: 0.6568047404289246
elapsed time: 1.3936107158660889
[Test]
[Batch: 1] running test loss: 0.10629818588495255, running test accuracy: 0.484375
test loss: 0.07723092860492523, accuracy: 0.6832740306854248
elapsed time: 1.1223466396331787
Test loss: 0.07723093
Test accuracy: 68.33%

Model 3: ์€๋‹‰์ธต + ๋“œ๋กญ์•„์›ƒ์ด ํฌํ•จ๋œ ์‹ฌ์ธต ๋ชจ๋ธ

learning_rate = 0.01
log_step = 20

model = Model3()
model = model.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

num_epochs = 20
best_val_acc = 0
best_epoch = 0

history = []
accuracy = []
for epoch in range(num_epochs):
    adjust_learning_rate(optimizer, epoch)
    train_loss, train_acc = train()
    val_loss, val_acc = validate()
    history.append((train_loss, val_loss))
    accuracy.append((train_acc, val_acc))

    if val_acc > best_val_acc:
        print("[Info] best validation accuracy!")
        best_val_acc = val_acc
        best_epoch = epoch
        torch.save(model.state_dict(), f"best_checkpoint_epoch_{epoch + 1}.pth")

torch.save(model.state_dict(), f"last_checkpoint_epoch_{num_epochs}.pth")

plt.plot([x[0] for x in accuracy], 'b', label='train')
plt.plot([x[1] for x in accuracy], 'r--', label='validation')
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

test_loss, test_accuracy = test()
print(f"Test loss: {test_loss:.8f}")
print(f"Test accuracy: {test_accuracy * 100.:.2f}%")

Output

[Epoch: 1 - Training]
train loss: 0.021055312863102665, accuracy: 0.3733333349227905
elapsed time: 7.819769620895386
[Epoch: 1 - Validation]
[Batch: 1] running val loss: 0.014086698181927204, running val accuracy: 0.671875
val loss: 0.014450000588005111, accuracy: 0.6449704170227051
elapsed time: 1.6248998641967773
[Info] best validation accuracy!

...

[Epoch: 20 - Training]
train loss: 0.013426496452755399, accuracy: 0.6651852130889893
elapsed time: 7.245265483856201
[Epoch: 20 - Validation]
[Batch: 1] running val loss: 0.010319100692868233, running val accuracy: 0.8125
val loss: 0.011233355166644034, accuracy: 0.7869822382926941
elapsed time: 1.279524326324463
[Test]
[Batch: 1] running test loss: 0.016797732561826706, running test accuracy: 0.78125
test loss: 0.011137271594932283, accuracy: 0.836298942565918
elapsed time: 1.0040242671966553
Test loss: 0.01113727
Test accuracy: 83.63%

Model 4: resnet50 transfer learning

learning_rate = 0.01
log_step = 20

model = ModelTransfer()
model = model.cuda()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=0.9)

num_epochs = 20
best_val_acc = 0
best_epoch = 0

history = []
accuracy = []
for epoch in range(num_epochs):
    adjust_learning_rate(optimizer, epoch)
    train_loss, train_acc = train()
    val_loss, val_acc = validate()
    history.append((train_loss, val_loss))
    accuracy.append((train_acc, val_acc))

    if val_acc > best_val_acc:
        print("[Info] best validation accuracy!")
        best_val_acc = val_acc
        best_epoch = epoch
        torch.save(model.state_dict(), f"best_checkpoint_epoch_{epoch + 1}.pth")

torch.save(model.state_dict(), f"last_checkpoint_epoch_{num_epochs}.pth")

plt.plot([x[0] for x in accuracy], 'b', label='train')
plt.plot([x[1] for x in accuracy], 'r--', label='validation')
plt.xlabel("Epochs")
plt.ylabel("Accuracy")
plt.legend()

test_loss, test_accuracy = test()
print(f"Test loss: {test_loss:.8f}")
print(f"Test accuracy: {test_accuracy * 100.:.2f}%")

Output

[Epoch: 1 - Training]
train loss: 0.011269845013265256, accuracy: 0.742222249507904
elapsed time: 13.960606813430786
[Epoch: 1 - Validation]
[Batch: 1] running val loss: 0.0027156963478773832, running val accuracy: 0.921875
val loss: 0.004075330301854738, accuracy: 0.9349112510681152
elapsed time: 1.6107542514801025
[Info] best validation accuracy!

...

[Epoch: 20 - Training]
train loss: 9.46323561516625e-05, accuracy: 1.0
elapsed time: 13.069997072219849
[Epoch: 20 - Validation]
[Batch: 1] running val loss: 0.00017253367695957422, running val accuracy: 1.0
val loss: 0.001278820651522755, accuracy: 0.9704142212867737
elapsed time: 1.609588861465454
[Test]
[Batch: 1] running test loss: 0.0005804044776596129, running test accuracy: 0.984375
test loss: 0.0012308261295104418, accuracy: 0.982206404209137
elapsed time: 1.8220326900482178
Test loss: 0.00123083
Test accuracy: 98.22%
