[PyTorch] Implementing MLP and CNN Models on the MNIST Dataset

최원석 · February 19, 2026

MLP (Multi-Layer Perceptron)

An MLP consists of several layers of nodes in which every node of one layer connects to every node of the next, which is why its layers are called fully connected layers. It follows an input layer - hidden layer - output layer structure and is very effective on tabular (structured) data.

Implementing the MLP Model

import numpy as np
import matplotlib.pyplot as plt

import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
BATCH_SIZE = 32
EPOCHS = 10

train_dataset = datasets.MNIST(root="../data/MNIST", train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.MNIST(root="../data/MNIST", train=False, download=True, transform=transforms.ToTensor())

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)

For train_dataset and test_dataset, the MNIST dataset is loaded via datasets and downloaded into the root path. The train flag (True/False) selects the training or test split. transforms.ToTensor() converts each image into a Tensor.

DataLoader sets the batch_size and whether to shuffle. (The test set is not shuffled.)
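The batching that DataLoader performs can be sketched in plain Python. This is only a conceptual illustration, not torch's actual implementation, and make_batches is a hypothetical helper:

```python
import random

# Conceptual sketch of DataLoader batching: split an index list into
# chunks of batch_size, optionally shuffled each epoch.
def make_batches(n_samples, batch_size, shuffle):
    indices = list(range(n_samples))
    if shuffle:
        random.shuffle(indices)  # new order every epoch, like shuffle=True
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]

batches = make_batches(60000, 32, shuffle=False)
print(len(batches))     # 1875 batches per epoch (60000 / 32)
print(len(batches[0]))  # each batch holds 32 sample indices
```

With shuffle=True the model sees the samples in a different order every epoch, which is why it is used for the training set only.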

for (X_train, y_train) in train_loader:
  print(X_train.shape)
  print(y_train.shape)
  break

pltsize = 1
plt.figure(figsize=(10 * pltsize, pltsize))  # figure sized to show 10 digits in a row

for i in range(10):
    plt.subplot(1, 10, i + 1)  # plt.subplot(rows, columns, index)
    plt.axis('off')
    plt.imshow(X_train[i, :, :, :].numpy().reshape(28, 28), cmap="gray_r")
    plt.title('Class: ' + str(y_train[i].item()))

class MLP(nn.Module):
  def __init__(self):
    super(MLP, self).__init__()
    self.fc1 = nn.Linear(28*28, 512)
    self.fc2 = nn.Linear(512, 256)
    self.fc3 = nn.Linear(256, 10)

  def forward(self, x):
    x = x.view(-1, 28*28)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    x = F.log_softmax(x, dim=1)
    return x

This defines the model class. view() flattens each 28×28 image into a one-dimensional 784-element vector.

Then features are extracted by alternating Linear layers with the ReLU activation.

ReLU maps every value less than or equal to 0 to 0 and passes values greater than 0 through unchanged.
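That description can be checked in a few lines of plain Python (relu here is a hand-rolled stand-in for F.relu):

```python
# ReLU in plain Python: max(0, x) applied element-wise.
def relu(x):
    return max(0.0, x)

values = [-2.0, -0.5, 0.0, 1.5, 3.0]
print([relu(v) for v in values])  # [0.0, 0.0, 0.0, 1.5, 3.0]
```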

Finally, log_softmax converts the outputs of fc3 into log-probabilities; exponentiating them gives class probabilities that sum to 1 (100%).

log_softmax(input, dim)
dim=1 computes along each row (across the 10 class scores here); dim=0 would compute down each column.
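The relationship between softmax and log_softmax can be verified with hand-rolled versions (conceptual sketches; torch's implementations are numerically fused, but the math is the same):

```python
import math

# softmax turns raw scores into probabilities that sum to 1;
# log_softmax is simply the logarithm of those probabilities.
def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def log_softmax(scores):
    return [math.log(p) for p in softmax(scores)]

probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))  # 1.0 -- the probabilities sum to 100%
print(log_softmax([2.0, 1.0, 0.1]))  # all values are <= 0 (log of a probability)
```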

model = MLP().to(device=torch.device("cpu"))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
criterion = nn.CrossEntropyLoss()

CrossEntropyLoss() plays the role that a loss like MSELoss() plays for regression, but because the goal here is classification, CrossEntropyLoss() is the better fit. One caveat: CrossEntropyLoss already applies log_softmax internally, so for a model that ends in log_softmax the strictly matching criterion is nn.NLLLoss(); applying log_softmax twice still trains, it is just redundant.
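A hand-computed example shows why cross-entropy suits classification: the loss is just the negative log of the probability the model assigned to the true class, so confident correct predictions are barely penalized (cross_entropy here is a hypothetical helper, not the torch criterion):

```python
import math

# Cross-entropy for one sample: -log(probability of the true class).
def cross_entropy(probs, true_class):
    return -math.log(probs[true_class])

confident = [0.05, 0.9, 0.05]  # model puts 90% on the true class 1
uncertain = [0.4, 0.3, 0.3]    # model puts only 30% on the true class 1
print(cross_entropy(confident, 1))  # small loss, about 0.105
print(cross_entropy(uncertain, 1))  # larger loss, about 1.204
```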

def train(model, train_loader, optimizer, log_interval):
  model.train()
  for batch_idx, (image, label) in enumerate(train_loader):
    image = image.to(device=torch.device("cpu"))
    label = label.to(device=torch.device("cpu"))  
    optimizer.zero_grad()
    output = model(image)
    loss = criterion(output, label)
    loss.backward()
    optimizer.step()

    if batch_idx % log_interval == 0:
      print("Train epoch: {} [{}/{} ({:.0f}%)]\tTrain Loss: {:.6f}".format(
                epoch, batch_idx * len(image),
                len(train_loader.dataset), 100. * batch_idx / len(train_loader),
                loss.item()))

This is the training routine. It takes the model, the training data, the optimizer, and a logging interval as arguments. The criterion defined above computes the loss from the model output and the labels. (The epoch inside the print statement refers to the loop variable of the outer training loop further below.)

def evaluate(model, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    
    with torch.no_grad():
        for image, label in test_loader:
            image = image.to(torch.device("cpu"))
            label = label.to(torch.device("cpu"))
            output = model(image)
            test_loss += criterion(output, label).item()
            prediction = output.max(1, keepdim=True)[1]
            correct += prediction.eq(label.view_as(prediction)).sum().item()
    
    test_loss /= (len(test_loader.dataset)/ BATCH_SIZE)
    test_accuracy = 100. * correct/len(test_loader.dataset)
    return test_loss, test_accuracy

This is the evaluation function that checks how well the model has learned.
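The accuracy computed inside evaluate() can be sketched in plain Python: take the argmax of each output row and compare it with the label (accuracy is a hypothetical helper mirroring the output.max(1)[1] logic above):

```python
# Accuracy: fraction of rows whose argmax matches the label, as a percentage.
def accuracy(outputs, labels):
    correct = 0
    for row, label in zip(outputs, labels):
        prediction = row.index(max(row))  # same idea as output.max(1)[1]
        if prediction == label:
            correct += 1
    return 100.0 * correct / len(labels)

outputs = [[0.1, 2.3, -1.0],  # predicts class 1
           [1.5, 0.2, 0.3],   # predicts class 0
           [0.0, 0.1, 0.2]]   # predicts class 2
labels = [1, 0, 1]            # the last prediction is wrong
print(accuracy(outputs, labels))  # 2 out of 3 correct, about 66.67
```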

for epoch in range(1, EPOCHS + 1):
  train(model, train_loader, optimizer, log_interval=200)
  test_loss, test_accuracy = evaluate(model, test_loader)
  print("\n[EPOCH: {}], \tTest Loss: {:.4f}, \tTest Accuracy: {:.2f} %\n".format(
        epoch, test_loss, test_accuracy))

This loop runs the actual training.

Output

Train epoch: 1 [0/60000 (0%)]	Train Loss: 2.295880
Train epoch: 1 [6400/60000 (11%)]	Train Loss: 1.856210
Train epoch: 1 [12800/60000 (21%)]	Train Loss: 0.766492
Train epoch: 1 [19200/60000 (32%)]	Train Loss: 0.451512
Train epoch: 1 [25600/60000 (43%)]	Train Loss: 0.649790
Train epoch: 1 [32000/60000 (53%)]	Train Loss: 0.578205
Train epoch: 1 [38400/60000 (64%)]	Train Loss: 0.283212
Train epoch: 1 [44800/60000 (75%)]	Train Loss: 0.336932
Train epoch: 1 [51200/60000 (85%)]	Train Loss: 0.262401
Train epoch: 1 [57600/60000 (96%)]	Train Loss: 0.398156

[EPOCH: 1], 	Test Loss: 0.3243, 	Test Accuracy: 90.29 %

Train epoch: 2 [0/60000 (0%)]	Train Loss: 0.354413
Train epoch: 2 [6400/60000 (11%)]	Train Loss: 0.369770
Train epoch: 2 [12800/60000 (21%)]	Train Loss: 0.291158
Train epoch: 2 [19200/60000 (32%)]	Train Loss: 0.519158
Train epoch: 2 [25600/60000 (43%)]	Train Loss: 0.156046
Train epoch: 2 [32000/60000 (53%)]	Train Loss: 0.158155
Train epoch: 2 [38400/60000 (64%)]	Train Loss: 0.243987
Train epoch: 2 [44800/60000 (75%)]	Train Loss: 0.248704
Train epoch: 2 [51200/60000 (85%)]	Train Loss: 0.162071
Train epoch: 2 [57600/60000 (96%)]	Train Loss: 0.095227

[EPOCH: 2], 	Test Loss: 0.2261, 	Test Accuracy: 93.38 %

Train epoch: 3 [0/60000 (0%)]	Train Loss: 0.106720
Train epoch: 3 [6400/60000 (11%)]	Train Loss: 0.385983
Train epoch: 3 [12800/60000 (21%)]	Train Loss: 0.193502
Train epoch: 3 [19200/60000 (32%)]	Train Loss: 0.542263
Train epoch: 3 [25600/60000 (43%)]	Train Loss: 0.047341
Train epoch: 3 [32000/60000 (53%)]	Train Loss: 0.072650
Train epoch: 3 [38400/60000 (64%)]	Train Loss: 0.396035
Train epoch: 3 [44800/60000 (75%)]	Train Loss: 0.041442
Train epoch: 3 [51200/60000 (85%)]	Train Loss: 0.278256
Train epoch: 3 [57600/60000 (96%)]	Train Loss: 0.334667

[EPOCH: 3], 	Test Loss: 0.1803, 	Test Accuracy: 94.75 %

Train epoch: 4 [0/60000 (0%)]	Train Loss: 0.094163
Train epoch: 4 [6400/60000 (11%)]	Train Loss: 0.351428
Train epoch: 4 [12800/60000 (21%)]	Train Loss: 0.372503
Train epoch: 4 [19200/60000 (32%)]	Train Loss: 0.278116
Train epoch: 4 [25600/60000 (43%)]	Train Loss: 0.112307
Train epoch: 4 [32000/60000 (53%)]	Train Loss: 0.087376
Train epoch: 4 [38400/60000 (64%)]	Train Loss: 0.073601
Train epoch: 4 [44800/60000 (75%)]	Train Loss: 0.319843
Train epoch: 4 [51200/60000 (85%)]	Train Loss: 0.114105
Train epoch: 4 [57600/60000 (96%)]	Train Loss: 0.057666

[EPOCH: 4], 	Test Loss: 0.1575, 	Test Accuracy: 95.36 %

Train epoch: 5 [0/60000 (0%)]	Train Loss: 0.132325
Train epoch: 5 [6400/60000 (11%)]	Train Loss: 0.175407
Train epoch: 5 [12800/60000 (21%)]	Train Loss: 0.221547
Train epoch: 5 [19200/60000 (32%)]	Train Loss: 0.056232
Train epoch: 5 [25600/60000 (43%)]	Train Loss: 0.489672
Train epoch: 5 [32000/60000 (53%)]	Train Loss: 0.482867
Train epoch: 5 [38400/60000 (64%)]	Train Loss: 0.065646
Train epoch: 5 [44800/60000 (75%)]	Train Loss: 0.091367
Train epoch: 5 [51200/60000 (85%)]	Train Loss: 0.063387
Train epoch: 5 [57600/60000 (96%)]	Train Loss: 0.184671

[EPOCH: 5], 	Test Loss: 0.1306, 	Test Accuracy: 96.05 %

Train epoch: 6 [0/60000 (0%)]	Train Loss: 0.016412
Train epoch: 6 [6400/60000 (11%)]	Train Loss: 0.174848
Train epoch: 6 [12800/60000 (21%)]	Train Loss: 0.043638
Train epoch: 6 [19200/60000 (32%)]	Train Loss: 0.147906
Train epoch: 6 [25600/60000 (43%)]	Train Loss: 0.169951
Train epoch: 6 [32000/60000 (53%)]	Train Loss: 0.175466
Train epoch: 6 [38400/60000 (64%)]	Train Loss: 0.065988
Train epoch: 6 [44800/60000 (75%)]	Train Loss: 0.097291
Train epoch: 6 [51200/60000 (85%)]	Train Loss: 0.031171
Train epoch: 6 [57600/60000 (96%)]	Train Loss: 0.107508

[EPOCH: 6], 	Test Loss: 0.1180, 	Test Accuracy: 96.46 %

Train epoch: 7 [0/60000 (0%)]	Train Loss: 0.023183
Train epoch: 7 [6400/60000 (11%)]	Train Loss: 0.041281
Train epoch: 7 [12800/60000 (21%)]	Train Loss: 0.075828
Train epoch: 7 [19200/60000 (32%)]	Train Loss: 0.052879
Train epoch: 7 [25600/60000 (43%)]	Train Loss: 0.021749
Train epoch: 7 [32000/60000 (53%)]	Train Loss: 0.198561
Train epoch: 7 [38400/60000 (64%)]	Train Loss: 0.050543
Train epoch: 7 [44800/60000 (75%)]	Train Loss: 0.075291
Train epoch: 7 [51200/60000 (85%)]	Train Loss: 0.125982
Train epoch: 7 [57600/60000 (96%)]	Train Loss: 0.064391

[EPOCH: 7], 	Test Loss: 0.1011, 	Test Accuracy: 96.92 %

Train epoch: 8 [0/60000 (0%)]	Train Loss: 0.016860
Train epoch: 8 [6400/60000 (11%)]	Train Loss: 0.117673
Train epoch: 8 [12800/60000 (21%)]	Train Loss: 0.019247
Train epoch: 8 [19200/60000 (32%)]	Train Loss: 0.225010
Train epoch: 8 [25600/60000 (43%)]	Train Loss: 0.073858
Train epoch: 8 [32000/60000 (53%)]	Train Loss: 0.155419
Train epoch: 8 [38400/60000 (64%)]	Train Loss: 0.107518
Train epoch: 8 [44800/60000 (75%)]	Train Loss: 0.042732
Train epoch: 8 [51200/60000 (85%)]	Train Loss: 0.402344
Train epoch: 8 [57600/60000 (96%)]	Train Loss: 0.028305

[EPOCH: 8], 	Test Loss: 0.0939, 	Test Accuracy: 97.07 %

Train epoch: 9 [0/60000 (0%)]	Train Loss: 0.126242
Train epoch: 9 [6400/60000 (11%)]	Train Loss: 0.168472
Train epoch: 9 [12800/60000 (21%)]	Train Loss: 0.033166
Train epoch: 9 [19200/60000 (32%)]	Train Loss: 0.030906
Train epoch: 9 [25600/60000 (43%)]	Train Loss: 0.017684
Train epoch: 9 [32000/60000 (53%)]	Train Loss: 0.095147
Train epoch: 9 [38400/60000 (64%)]	Train Loss: 0.020783
Train epoch: 9 [44800/60000 (75%)]	Train Loss: 0.146282
Train epoch: 9 [51200/60000 (85%)]	Train Loss: 0.047931
Train epoch: 9 [57600/60000 (96%)]	Train Loss: 0.105028

[EPOCH: 9], 	Test Loss: 0.0915, 	Test Accuracy: 97.25 %

Train epoch: 10 [0/60000 (0%)]	Train Loss: 0.027292
Train epoch: 10 [6400/60000 (11%)]	Train Loss: 0.144544
Train epoch: 10 [12800/60000 (21%)]	Train Loss: 0.009972
Train epoch: 10 [19200/60000 (32%)]	Train Loss: 0.008092
Train epoch: 10 [25600/60000 (43%)]	Train Loss: 0.026988
Train epoch: 10 [32000/60000 (53%)]	Train Loss: 0.019556
Train epoch: 10 [38400/60000 (64%)]	Train Loss: 0.048808
Train epoch: 10 [44800/60000 (75%)]	Train Loss: 0.107426
Train epoch: 10 [51200/60000 (85%)]	Train Loss: 0.011870
Train epoch: 10 [57600/60000 (96%)]	Train Loss: 0.113121

[EPOCH: 10], 	Test Loss: 0.0856, 	Test Accuracy: 97.40 %

CNN (Convolutional Neural Network)

A convolutional neural network is built from Convolution layers and Pooling layers.

Implementing the CNN Model

import torch
from torch import nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
BATCH_SIZE = 32
EPOCHS = 10

train_dataset = datasets.MNIST(root="../data/MNIST", train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.MNIST(root="../data/MNIST", train=False, download=True, transform=transforms.ToTensor())

train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.keep_prob=0.5

        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(1,32,kernel_size=3,stride=1,padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2,stride=2)
        )
        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(32,64,kernel_size=3,stride=1,padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2,stride=2)
        )
        self.layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(64,128,kernel_size=3,stride=1,padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2,stride=2)
        )

       
        self.fc1 = torch.nn.Linear(128 * 3 * 3, 625, bias=True)
        torch.nn.init.xavier_uniform_(self.fc1.weight) # weight initialization

        self.layer4 = torch.nn.Sequential(
            self.fc1,
            torch.nn.ReLU(),
            torch.nn.Dropout(p=(1-self.keep_prob))
        )

        self.fc2 = torch.nn.Linear(625, 10, bias=True)
        torch.nn.init.xavier_uniform_(self.fc2.weight) # weight initialization

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)
        out = self.layer4(out)
        out = self.fc2(out)
        return out

Sequential() groups the layers into single blocks. In the earlier MLP code each layer was called individually inside forward; Sequential() makes the code more concise. Overall, the network consists of three conv blocks and two fully connected layers.

If the initial weights are too large or too small, it becomes hard to converge to the values we want. torch.nn.init.xavier_uniform_ initializes the weights taking the number of input and output nodes into account, so that the values are not skewed in either direction.
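Xavier (Glorot) uniform initialization draws weights from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), so the bound shrinks as the layer gets wider. A quick sketch (xavier_bound is a hypothetical helper computing the bound torch uses for xavier_uniform_ with default gain):

```python
import math

# Bound of the Xavier uniform distribution: a = sqrt(6 / (fan_in + fan_out)).
def xavier_bound(fan_in, fan_out):
    return math.sqrt(6.0 / (fan_in + fan_out))

# Bounds for the two fully connected layers above:
print(xavier_bound(128 * 3 * 3, 625))  # fc1: about 0.058
print(xavier_bound(625, 10))           # fc2: about 0.097
```

The wider fc1 gets the tighter bound, which is exactly the "not too large, not too small" balancing the paragraph above describes.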

[ A few open questions ]
Why does Conv2d use padding? And is there a rule for choosing the number of output channels in Conv2d?

Padding adds virtual pixels around the border of the incoming image. As a filter slides over the image, the border pixels participate in far fewer computations than the central pixels (so their importance and features are under-extracted). Padding is used to mitigate this.
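Padding's effect on the output size can be checked with the standard formula out = floor((in + 2*padding - kernel) / stride) + 1 (conv_out is a hypothetical helper; floor division mirrors the default behavior of Conv2d and MaxPool2d):

```python
# Standard conv/pool output-size formula.
def conv_out(size, kernel, stride=1, padding=0):
    return (size + 2 * padding - kernel) // stride + 1

# With kernel_size=3, padding=1, stride=1 the 28x28 input stays 28x28:
print(conv_out(28, 3, 1, 1))  # 28
# Without padding, the border shrinks at every layer:
print(conv_out(28, 3, 1, 0))  # 26

# Tracing the three conv+pool blocks above: 28 -> 14 -> 7 -> 3,
# which is why fc1 expects 128 * 3 * 3 inputs.
size = 28
for _ in range(3):
    size = conv_out(size, 3, 1, 1)  # conv keeps the spatial size
    size = conv_out(size, 2, 2, 0)  # 2x2 max pool halves it (floor)
print(size)  # 3
```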

There is no exact calculation; it is a hyperparameter. As long as the output channels of one layer match the input channels of the next there is no real problem, but by convention powers of two are used.

model = CNN().to(DEVICE)

criterion = nn.CrossEntropyLoss().to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr = 0.001)

Adam is used from optim instead of SGD. Adam combines the strengths of SGD, momentum, and RMSProp, which lowers the chance of getting stuck in bad spots such as poor local minima. That said, SGD is sometimes still preferred when more fine-grained tuning is needed.
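A minimal single-parameter sketch of one Adam step under the standard defaults (beta1=0.9, beta2=0.999, eps=1e-8) shows the momentum and RMSProp pieces working together (adam_step is a hypothetical helper, not torch's implementation):

```python
import math

# One Adam step for a single scalar parameter.
# m: momentum-style running mean of gradients.
# v: RMSProp-style running mean of squared gradients.
def adam_step(param, grad, m, v, t, lr=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad          # momentum part
    v = beta2 * v + (1 - beta2) * grad * grad   # RMSProp part
    m_hat = m / (1 - beta1 ** t)                # bias correction (step t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

p, m, v = 1.0, 0.0, 0.0
p, m, v = adam_step(p, grad=0.5, m=m, v=v, t=1)
print(p)  # the very first step moves by roughly lr, regardless of gradient scale
```

That per-parameter scaling by sqrt(v_hat) is what makes Adam less sensitive to the raw magnitude of the gradients than plain SGD.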

def train(model, train_loader, optimizer, log_interval):
    model.train()

    for batch_idx, (image, label) in enumerate(train_loader):
      image = image.to(DEVICE)
      label = label.to(DEVICE)

      optimizer.zero_grad()
      output = model(image)
      loss = criterion(output, label)
      loss.backward()
      optimizer.step()

      if batch_idx % log_interval == 0:
        print("Train epoch: {} [{}/{} ({:.0f}%)]\tTrain Loss: {:.6f}".format(
                epoch, batch_idx * len(image),
                len(train_loader.dataset), 100. * batch_idx / len(train_loader),
                loss.item()))

This function trains the model.

def evaluation(model, test_loader):
    model.eval()
    test_loss = 0
    correct = 0

    with torch.no_grad():
        for image, label in test_loader:
            image = image.to(DEVICE)
            label = label.to(DEVICE)

            output = model(image)
            test_loss += criterion(output, label).item()
            prediction = output.max(1, keepdim=True)[1]
            correct += prediction.eq(label.view_as(prediction)).sum().item()
    
    test_loss /= (len(test_loader.dataset)/ BATCH_SIZE)
    test_accuracy = 100. * correct/len(test_loader.dataset)
    return test_loss, test_accuracy

This function uses the test data set to verify how well the model has learned.

for epoch in range(1, EPOCHS + 1):
    train(model, train_loader, optimizer, log_interval=200)
    test_loss, test_accuracy = evaluation(model, test_loader)
    print("\n[EPOCH: {}], \tTest Loss: {:.4f}, \tTest Accuracy: {:.2f} %\n".format(
        epoch, test_loss, test_accuracy))

Output

Train epoch: 1 [0/60000 (0%)]	Train Loss: 2.302500
Train epoch: 1 [6400/60000 (11%)]	Train Loss: 0.020710
Train epoch: 1 [12800/60000 (21%)]	Train Loss: 0.072938
Train epoch: 1 [19200/60000 (32%)]	Train Loss: 0.070585
Train epoch: 1 [25600/60000 (43%)]	Train Loss: 0.010864
Train epoch: 1 [32000/60000 (53%)]	Train Loss: 0.164177
Train epoch: 1 [38400/60000 (64%)]	Train Loss: 0.000477
Train epoch: 1 [44800/60000 (75%)]	Train Loss: 0.026269
Train epoch: 1 [51200/60000 (85%)]	Train Loss: 0.029791
Train epoch: 1 [57600/60000 (96%)]	Train Loss: 0.074889

[EPOCH: 1], 	Test Loss: 0.0415, 	Test Accuracy: 98.69 %

Train epoch: 2 [0/60000 (0%)]	Train Loss: 0.090489
Train epoch: 2 [6400/60000 (11%)]	Train Loss: 0.006572
Train epoch: 2 [12800/60000 (21%)]	Train Loss: 0.157872
Train epoch: 2 [19200/60000 (32%)]	Train Loss: 0.060257
Train epoch: 2 [25600/60000 (43%)]	Train Loss: 0.000952
Train epoch: 2 [32000/60000 (53%)]	Train Loss: 0.003554
Train epoch: 2 [38400/60000 (64%)]	Train Loss: 0.009752
Train epoch: 2 [44800/60000 (75%)]	Train Loss: 0.040368
Train epoch: 2 [51200/60000 (85%)]	Train Loss: 0.006694
Train epoch: 2 [57600/60000 (96%)]	Train Loss: 0.250820

[EPOCH: 2], 	Test Loss: 0.0262, 	Test Accuracy: 99.14 %

Train epoch: 3 [0/60000 (0%)]	Train Loss: 0.001073
Train epoch: 3 [6400/60000 (11%)]	Train Loss: 0.009513
Train epoch: 3 [12800/60000 (21%)]	Train Loss: 0.082231
Train epoch: 3 [19200/60000 (32%)]	Train Loss: 0.001920
Train epoch: 3 [25600/60000 (43%)]	Train Loss: 0.055383
Train epoch: 3 [32000/60000 (53%)]	Train Loss: 0.093025
Train epoch: 3 [38400/60000 (64%)]	Train Loss: 0.006312
Train epoch: 3 [44800/60000 (75%)]	Train Loss: 0.160864
Train epoch: 3 [51200/60000 (85%)]	Train Loss: 0.031659
Train epoch: 3 [57600/60000 (96%)]	Train Loss: 0.002794

[EPOCH: 3], 	Test Loss: 0.0287, 	Test Accuracy: 99.06 %

Train epoch: 4 [0/60000 (0%)]	Train Loss: 0.032770
Train epoch: 4 [6400/60000 (11%)]	Train Loss: 0.004879
Train epoch: 4 [12800/60000 (21%)]	Train Loss: 0.074115
Train epoch: 4 [19200/60000 (32%)]	Train Loss: 0.000641
Train epoch: 4 [25600/60000 (43%)]	Train Loss: 0.000061
Train epoch: 4 [32000/60000 (53%)]	Train Loss: 0.000526
Train epoch: 4 [38400/60000 (64%)]	Train Loss: 0.002731
Train epoch: 4 [44800/60000 (75%)]	Train Loss: 0.000543
Train epoch: 4 [51200/60000 (85%)]	Train Loss: 0.070365
Train epoch: 4 [57600/60000 (96%)]	Train Loss: 0.003590

[EPOCH: 4], 	Test Loss: 0.0275, 	Test Accuracy: 99.18 %

Train epoch: 5 [0/60000 (0%)]	Train Loss: 0.013684
Train epoch: 5 [6400/60000 (11%)]	Train Loss: 0.000570
Train epoch: 5 [12800/60000 (21%)]	Train Loss: 0.057022
Train epoch: 5 [19200/60000 (32%)]	Train Loss: 0.038811
Train epoch: 5 [25600/60000 (43%)]	Train Loss: 0.000251
Train epoch: 5 [32000/60000 (53%)]	Train Loss: 0.003567
Train epoch: 5 [38400/60000 (64%)]	Train Loss: 0.160953
Train epoch: 5 [44800/60000 (75%)]	Train Loss: 0.021606
Train epoch: 5 [51200/60000 (85%)]	Train Loss: 0.000387
Train epoch: 5 [57600/60000 (96%)]	Train Loss: 0.000611

[EPOCH: 5], 	Test Loss: 0.0267, 	Test Accuracy: 99.25 %

Train epoch: 6 [0/60000 (0%)]	Train Loss: 0.001599
Train epoch: 6 [6400/60000 (11%)]	Train Loss: 0.052861
Train epoch: 6 [12800/60000 (21%)]	Train Loss: 0.014980
Train epoch: 6 [19200/60000 (32%)]	Train Loss: 0.000985
Train epoch: 6 [25600/60000 (43%)]	Train Loss: 0.000812
Train epoch: 6 [32000/60000 (53%)]	Train Loss: 0.002275
Train epoch: 6 [38400/60000 (64%)]	Train Loss: 0.007388
Train epoch: 6 [44800/60000 (75%)]	Train Loss: 0.043215
Train epoch: 6 [51200/60000 (85%)]	Train Loss: 0.000217
Train epoch: 6 [57600/60000 (96%)]	Train Loss: 0.001535

[EPOCH: 6], 	Test Loss: 0.0283, 	Test Accuracy: 99.15 %

Train epoch: 7 [0/60000 (0%)]	Train Loss: 0.000100
Train epoch: 7 [6400/60000 (11%)]	Train Loss: 0.000396
Train epoch: 7 [12800/60000 (21%)]	Train Loss: 0.000071
Train epoch: 7 [19200/60000 (32%)]	Train Loss: 0.000645
Train epoch: 7 [25600/60000 (43%)]	Train Loss: 0.065544
Train epoch: 7 [32000/60000 (53%)]	Train Loss: 0.004502
Train epoch: 7 [38400/60000 (64%)]	Train Loss: 0.000171
Train epoch: 7 [44800/60000 (75%)]	Train Loss: 0.000013
Train epoch: 7 [51200/60000 (85%)]	Train Loss: 0.004819
Train epoch: 7 [57600/60000 (96%)]	Train Loss: 0.014584

[EPOCH: 7], 	Test Loss: 0.0280, 	Test Accuracy: 99.17 %

Train epoch: 8 [0/60000 (0%)]	Train Loss: 0.024070
Train epoch: 8 [6400/60000 (11%)]	Train Loss: 0.000015
Train epoch: 8 [12800/60000 (21%)]	Train Loss: 0.000056
Train epoch: 8 [19200/60000 (32%)]	Train Loss: 0.128571
Train epoch: 8 [25600/60000 (43%)]	Train Loss: 0.000719
Train epoch: 8 [32000/60000 (53%)]	Train Loss: 0.000371
Train epoch: 8 [38400/60000 (64%)]	Train Loss: 0.000062
Train epoch: 8 [44800/60000 (75%)]	Train Loss: 0.005178
Train epoch: 8 [51200/60000 (85%)]	Train Loss: 0.000601
Train epoch: 8 [57600/60000 (96%)]	Train Loss: 0.002863

[EPOCH: 8], 	Test Loss: 0.0374, 	Test Accuracy: 99.18 %

Train epoch: 9 [0/60000 (0%)]	Train Loss: 0.048398
Train epoch: 9 [6400/60000 (11%)]	Train Loss: 0.008198
Train epoch: 9 [12800/60000 (21%)]	Train Loss: 0.029910
Train epoch: 9 [19200/60000 (32%)]	Train Loss: 0.002376
Train epoch: 9 [25600/60000 (43%)]	Train Loss: 0.000050
Train epoch: 9 [32000/60000 (53%)]	Train Loss: 0.000209
Train epoch: 9 [38400/60000 (64%)]	Train Loss: 0.001148
Train epoch: 9 [44800/60000 (75%)]	Train Loss: 0.000002
Train epoch: 9 [51200/60000 (85%)]	Train Loss: 0.000002
Train epoch: 9 [57600/60000 (96%)]	Train Loss: 0.000423

[EPOCH: 9], 	Test Loss: 0.0272, 	Test Accuracy: 99.30 %

Train epoch: 10 [0/60000 (0%)]	Train Loss: 0.002476
Train epoch: 10 [6400/60000 (11%)]	Train Loss: 0.000269
Train epoch: 10 [12800/60000 (21%)]	Train Loss: 0.000115
Train epoch: 10 [19200/60000 (32%)]	Train Loss: 0.001352
Train epoch: 10 [25600/60000 (43%)]	Train Loss: 0.326977
Train epoch: 10 [32000/60000 (53%)]	Train Loss: 0.000599
Train epoch: 10 [38400/60000 (64%)]	Train Loss: 0.000920
Train epoch: 10 [44800/60000 (75%)]	Train Loss: 0.000658
Train epoch: 10 [51200/60000 (85%)]	Train Loss: 0.027969
Train epoch: 10 [57600/60000 (96%)]	Train Loss: 0.000011

[EPOCH: 10], 	Test Loss: 0.0288, 	Test Accuracy: 99.34 %

Closing thoughts…

I am now at the stage where Linear and Conv2d, the building blocks of MLPs and CNNs, are becoming familiar and their inner workings are starting to make sense. The implementation itself is not yet second nature, so it is far from perfect (that is just a matter of typing more code). Designing the architecture, however, is still hard to get a feel for: how many layers would yield better accuracy? That is something I need to think about and study further.
