cs231n Assignment 2 Q5: PyTorch

이준학 · July 17, 2024


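The last part of Assignment 2 is implementing models with PyTorch, using the GPU that Colab provides. Up to now I built every model with NumPy. Having worked through the basic concepts by implementing them myself, it's now time to design models with the functions PyTorch provides. I plan to write up the non-PyTorch material from this assignment in separate blog posts.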

1. Barebones PyTorch

The assignment builds models in three broad ways. The first is barebones PyTorch, the lowest-level approach. The other two use nn.Module and nn.Sequential. Let's start with the barebones approach. The assignment also has you implement a two-layer net, but I'll only cover the three-layer ConvNet here.

def three_layer_convnet(x, params):
   
    conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b = params
    scores = None
    ################################################################################
    # TODO: Implement the forward pass for the three-layer ConvNet.                #
    ################################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

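    # CONV (padding 2) -> ReLU -> CONV (padding 1) -> ReLU
    x = F.relu(F.conv2d(x, conv_w1, conv_b1, padding=2))
    x = F.relu(F.conv2d(x, conv_w2, conv_b2, padding=1))
    # Flatten (N, C, H, W) to (N, C*H*W), then apply the fully connected layer
    scores = flatten(x).mm(fc_w) + fc_b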


    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ################################################################################
    #                                 END OF YOUR CODE                             #
    ################################################################################
    return scores

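The architecture implemented here is CONV-ReLU-CONV-ReLU-FC. The output of the conv layers has to be flattened before it goes into the FC layer, because the conv output is a 4D tensor of shape (N, C, H, W) while the FC layer expects a 2D input of shape (N, D). I used nn.functional.conv2d here, and what confused me at first was the difference between nn.Conv2d and nn.functional.conv2d. It turns out nn.Conv2d is the higher-level one: it is a module that holds its own weight and bias, whereas nn.functional.conv2d is a plain function that takes the weight and bias as arguments. Since this part is a low-level implementation where we manage the parameters ourselves, nn.functional is the right fit. This is all material I had already implemented before, so it wasn't particularly hard. You just need to know which functions PyTorch provides.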
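
To make the nn.Conv2d vs nn.functional.conv2d difference concrete, here is a minimal sketch. The tensor sizes are made up for illustration and are not the assignment's. It runs the same convolution both ways and then flattens the result the way the FC layer needs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# A tiny fake minibatch: 2 images, 3 channels, 8x8 pixels (illustrative sizes).
x = torch.randn(2, 3, 8, 8)

# Module style: the layer creates and owns its weight and bias.
conv = nn.Conv2d(in_channels=3, out_channels=4, kernel_size=5, padding=2)
out_module = conv(x)

# Functional style: the caller passes the weight and bias explicitly.
out_functional = F.conv2d(x, conv.weight, conv.bias, padding=2)

# Both paths compute the same convolution.
same = torch.allclose(out_module, out_functional)

# Flattening keeps the batch dimension and merges the rest: (2, 4, 8, 8) -> (2, 256).
flat = out_functional.flatten(start_dim=1)
print(same, tuple(flat.shape))
```

In the barebones part we create the weights ourselves, so the functional form is the natural choice. In the later parts the module form keeps track of the parameters for us.
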
Because this is barebones, the parameter update during training also has to be implemented by hand.

def train_part2(model_fn, params, learning_rate):
    for t, (x, y) in enumerate(loader_train):
        # Move the data to the proper device (GPU or CPU)
        x = x.to(device=device, dtype=dtype)
        y = y.to(device=device, dtype=torch.long)

        # Forward pass: compute scores and loss
        scores = model_fn(x, params)
        loss = F.cross_entropy(scores, y)

        # Backward pass: PyTorch figures out which Tensors in the computational
        # graph has requires_grad=True and uses backpropagation to compute the
        # gradient of the loss with respect to these Tensors, and stores the
        # gradients in the .grad attribute of each Tensor.
        loss.backward()

        # Update parameters. We don't want to backpropagate through the
        # parameter updates, so we scope the updates under a torch.no_grad()
        # context manager to prevent a computational graph from being built.
        with torch.no_grad():
            for w in params:
                w -= learning_rate * w.grad

                # Manually zero the gradients after running the backward pass
                w.grad.zero_()

        if t % print_every == 0:
            print('Iteration %d, loss = %.4f' % (t, loss.item()))
            check_accuracy_part2(loader_val, model_fn, params)
            print()

With these functions in place, all that's left is to initialize the weights and biases with matching dimensions and pass them to train_part2().
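
As a rough sketch of what "initialize with matching dimensions" means, the snippet below builds the six parameter tensors and runs one manual SGD step on random data. The helper `kaiming_weight`, the small channel counts, and the 8x8 input size are my own assumptions for illustration. The assignment ships its own initialization helpers and uses 32x32 CIFAR-10 images.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def kaiming_weight(shape):
    """Hypothetical helper: Kaiming-normal init for a conv or fc weight."""
    # fan_in = inputs per output unit: in_channels * kH * kW for conv, rows for fc.
    fan_in = shape[1] * shape[2] * shape[3] if len(shape) == 4 else shape[0]
    w = torch.randn(shape) * (2.0 / fan_in) ** 0.5
    w.requires_grad = True  # leaf tensor, so autograd will fill in w.grad
    return w

channel_1, channel_2, num_classes = 4, 3, 10  # small sizes for illustration
H = W = 8                                     # CIFAR-10 would use 32

params = [
    kaiming_weight((channel_1, 3, 5, 5)),              # conv_w1
    torch.zeros(channel_1, requires_grad=True),        # conv_b1
    kaiming_weight((channel_2, channel_1, 3, 3)),      # conv_w2
    torch.zeros(channel_2, requires_grad=True),        # conv_b2
    kaiming_weight((channel_2 * H * W, num_classes)),  # fc_w: (in_features, classes)
    torch.zeros(num_classes, requires_grad=True),      # fc_b
]

def forward(x, params):
    conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b = params
    x = F.relu(F.conv2d(x, conv_w1, conv_b1, padding=2))
    x = F.relu(F.conv2d(x, conv_w2, conv_b2, padding=1))
    return x.flatten(start_dim=1).mm(fc_w) + fc_b

# Random stand-in data: 6 images with random labels.
x = torch.randn(6, 3, H, W)
y = torch.randint(0, num_classes, (6,))

scores = forward(x, params)
loss = F.cross_entropy(scores, y)
loss.backward()

# One manual SGD step, mirroring the update loop in train_part2.
with torch.no_grad():
    for w in params:
        w -= 1e-2 * w.grad
        w.grad.zero_()

print(tuple(scores.shape))
```

The one thing that is easy to get wrong is fc_w: its first dimension has to equal channel_2 * H * W, the size of the flattened output of the second conv layer.
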

2. nn.Module

Now we design the same three-layer ConvNet a bit more simply with the nn.Module API. The network architecture is identical to the one implemented above.

class ThreeLayerConvNet(nn.Module):
    def __init__(self, in_channel, channel_1, channel_2, num_classes):
        super().__init__()
        ########################################################################
        # TODO: Set up the layers you need for a three-layer ConvNet with the  #
        # architecture defined above.                                          #
        ########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
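        # 5x5 conv with padding 2 and 3x3 conv with padding 1 both preserve the 32x32 spatial size
        self.conv1 = nn.Conv2d(in_channel, channel_1, kernel_size=(5, 5), padding=2)
        nn.init.kaiming_normal_(self.conv1.weight)
        self.conv2 = nn.Conv2d(channel_1, channel_2, kernel_size=(3, 3), padding=1)
        nn.init.kaiming_normal_(self.conv2.weight)
        # Fully connected layer over the flattened (channel_2, 32, 32) feature map
        self.fc1 = nn.Linear(channel_2 * 32 * 32, num_classes)
        nn.init.kaiming_normal_(self.fc1.weight)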

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ########################################################################
        #                          END OF YOUR CODE                            #
        ########################################################################

    def forward(self, x):
        scores = None
        ########################################################################
        # TODO: Implement the forward function for a 3-layer ConvNet. you      #
        # should use the layers you defined in __init__ and specify the        #
        # connectivity of those layers in forward()                            #
        ########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        x=F.relu(self.conv1(x))
        x=F.relu(self.conv2(x))
        scores=self.fc1(flatten(x))


        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ########################################################################
        #                             END OF YOUR CODE                         #
        ########################################################################
        return scores


def test_ThreeLayerConvNet():
    x = torch.zeros((64, 3, 32, 32), dtype=dtype)  # minibatch size 64, image size [3, 32, 32]
    model = ThreeLayerConvNet(in_channel=3, channel_1=12, channel_2=8, num_classes=10)
    scores = model(x)
    print(scores.size())  # you should see [64, 10]
test_ThreeLayerConvNet()

The thing to note is that all the layers with parameters are defined in __init__(), while forward() wires them together and applies ReLU.
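
The input size channel_2*32*32 of the Linear layer follows from the standard convolution output-size formula. The short sketch below just does that arithmetic in plain Python. The function name `conv_output_size` is my own and not part of the assignment.

```python
def conv_output_size(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

# conv1: a 5x5 kernel with padding 2 keeps a 32x32 input at 32x32.
after_conv1 = conv_output_size(32, kernel_size=5, padding=2)
# conv2: a 3x3 kernel with padding 1 also preserves the size.
after_conv2 = conv_output_size(after_conv1, kernel_size=3, padding=1)

channel_2 = 16
fc_in_features = channel_2 * after_conv2 * after_conv2
print(after_conv1, after_conv2, fc_in_features)  # 32 32 16384
```

If the padding were dropped, the spatial size would shrink at each conv layer and the Linear layer's input size would have to change with it.
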

def train_part34(model, optimizer, epochs=1):
    """
    Train a model on CIFAR-10 using the PyTorch Module API.

    Inputs:
    - model: A PyTorch Module giving the model to train.
    - optimizer: An Optimizer object we will use to train the model
    - epochs: (Optional) A Python integer giving the number of epochs to train for

    Returns: Nothing, but prints model accuracies during training.
    """
    model = model.to(device=device)  # move the model parameters to CPU/GPU
    for e in range(epochs):
        for t, (x, y) in enumerate(loader_train):
            model.train()  # put model to training mode
            x = x.to(device=device, dtype=dtype)  # move to device, e.g. GPU
            y = y.to(device=device, dtype=torch.long)

            scores = model(x)
            loss = F.cross_entropy(scores, y)

            # Zero out all of the gradients for the variables which the optimizer
            # will update.
            optimizer.zero_grad()

            # This is the backwards pass: compute the gradient of the loss with
            # respect to each  parameter of the model.
            loss.backward()

            # Actually update the parameters of the model using the gradients
            # computed by the backwards pass.
            optimizer.step()

            if t % print_every == 0:
                print('Iteration %d, loss = %.4f' % (t, loss.item()))
                check_accuracy_part34(loader_val, model)
                print()

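As with the barebones version, we also write a function that handles training. Because an optimizer is used this time, the gradient reset, backward pass, and parameter update reduce to optimizer.zero_grad(), loss.backward(), and optimizer.step(). The optimizer can be any of the update rules we've learned so far, such as SGD or Adam.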
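
To convince myself that the optimizer is not doing anything magical, here is a small sketch comparing one step of optim.SGD (no momentum) with the hand-written update from the barebones part. The tiny nn.Linear model and random data are stand-ins for illustration only.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

# Two identical copies of a tiny model (a stand-in, not the ConvNet above).
model_a = nn.Linear(5, 3)
model_b = copy.deepcopy(model_a)

x = torch.randn(8, 5)
y = torch.randint(0, 3, (8,))
lr = 0.1

# Path A: let the optimizer do the update.
optimizer = optim.SGD(model_a.parameters(), lr=lr)
optimizer.zero_grad()
F.cross_entropy(model_a(x), y).backward()
optimizer.step()

# Path B: the manual update used in the barebones training loop.
F.cross_entropy(model_b(x), y).backward()
with torch.no_grad():
    for w in model_b.parameters():
        w -= lr * w.grad
        w.grad.zero_()

# Plain SGD without momentum performs the same update as the manual loop.
matches = all(
    torch.allclose(pa, pb)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print(matches)
```

Switching to another update rule only changes the line that constructs the optimizer, for example optim.Adam(model_a.parameters(), lr=lr). The training loop itself stays the same.
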

learning_rate = 3e-3
channel_1 = 32
channel_2 = 16

model = None
optimizer = None
################################################################################
# TODO: Instantiate your ThreeLayerConvNet model and a corresponding optimizer #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

model=ThreeLayerConvNet(3,channel_1,channel_2,10)
optimizer=optim.SGD(model.parameters(), lr=learning_rate)

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

train_part34(model, optimizer)

Calling the function like this is all it takes. Training the model was simpler than with the barebones approach.

3. nn.Sequential

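This time let's design the model with nn.Sequential. It is less flexible than subclassing nn.Module, since it can only chain layers one after another, but it's an API that works for most cases.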

channel_1 = 32
channel_2 = 16
learning_rate = 1e-2

model = None
optimizer = None

################################################################################
# TODO: Rewrite the 2-layer ConvNet with bias from Part III with the           #
# Sequential API.                                                              #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
model=nn.Sequential(
    nn.Conv2d(3,channel_1, kernel_size=(5,5),padding=2),
    nn.ReLU(),
    nn.Conv2d(channel_1,channel_2,kernel_size=(3,3),padding=1),
    nn.ReLU(),
    Flatten(),
    nn.Linear(channel_2*32*32,10)
)
optimizer=optim.SGD(model.parameters(), lr=learning_rate,
                     momentum=0.9, nesterov=True)
# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

train_part34(model, optimizer)

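This is much simpler than the previous approaches. With nn.Sequential you use the layer classes under nn (such as nn.ReLU) instead of nn.functional, because every element has to be a module. The optimizer is again optim.SGD (here with Nesterov momentum), and train_part34 from the nn.Module part is reused as-is.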
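
Here is a small sketch of why the module versions are needed. The `Flatten` class below is my guess at what the assignment's helper looks like, not a copy of it, and the layer sizes are made up. The Sequential model gives the same output as wiring the same layers together by hand with F.relu.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Flatten(nn.Module):
    """Sketch of a flatten layer; assumed to resemble the assignment's helper."""
    def forward(self, x):
        return x.view(x.shape[0], -1)  # (N, C, H, W) -> (N, C*H*W)

torch.manual_seed(0)
x = torch.randn(2, 3, 4, 4)  # illustrative sizes, not CIFAR-10

# nn.Sequential only accepts modules, so the activation and the reshape
# are written as layer objects rather than function calls.
model = nn.Sequential(
    nn.Conv2d(3, 2, kernel_size=3, padding=1),
    nn.ReLU(),
    Flatten(),
    nn.Linear(2 * 4 * 4, 10),
)

# The same computation written by hand with the functional API.
conv, fc = model[0], model[3]
manual = fc(F.relu(conv(x)).flatten(start_dim=1))

same = torch.allclose(model(x), manual)
print(same, tuple(model(x).shape))
```
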

4. Training on CIFAR-10 to reach at least 70% validation accuracy

This time the goal is to use what we learned above, choosing my own architecture and hyperparameters, to get the validation accuracy above 70%. Everyone will design the model differently; I built a 5-layer network with the structure (CONV-ReLU-BATCHNORM) x2 - CONV-ReLU-FC-ReLU-FC. For the optimizer I used Adam, one of the most widely used, with a learning rate of 5e-4. The exact structure is in the code below.

model = None
optimizer = None
channel_1=64
channel_2=32
channel_3=16
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
model=nn.Sequential(
    nn.Conv2d(3,channel_1, kernel_size=(7,7),padding=3),
    nn.ReLU(),
    nn.BatchNorm2d(channel_1),
    nn.Conv2d(channel_1,channel_2,kernel_size=(5,5),padding=2),
    nn.ReLU(),
    nn.BatchNorm2d(channel_2),
    nn.Conv2d(channel_2,channel_3,kernel_size=(3,3),padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(channel_3*32*32,32),
    nn.ReLU(),
    nn.Linear(32,10)
)
optimizer=optim.Adam(model.parameters(), lr=5e-4,
                     betas=(0.9,0.999))

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                                 END OF YOUR CODE                             #
################################################################################

# You should get at least 70% accuracy.
# You may modify the number of epochs to any number below 15.
train_part34(model, optimizer, epochs=10)

Training this model gave a best validation accuracy of 71.7% and an accuracy of 67.76% on the test set.
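
The accuracy numbers come from the assignment's check_accuracy_part34, which isn't shown in this post. As a rough idea of what such a function does, here is a hypothetical stand-in I wrote called `check_accuracy`, tried on a toy model and random data. It is a sketch under those assumptions, not the assignment's code.

```python
import torch
import torch.nn as nn

def check_accuracy(loader, model, device=torch.device("cpu")):
    """Fraction of correctly classified samples (hypothetical stand-in)."""
    num_correct, num_samples = 0, 0
    model.eval()  # BatchNorm switches to its running statistics in eval mode
    with torch.no_grad():  # no graph is needed for evaluation
        for x, y in loader:
            x = x.to(device)
            y = y.to(device)
            preds = model(x).argmax(dim=1)  # class with the highest score
            num_correct += (preds == y).sum().item()
            num_samples += y.shape[0]
    return num_correct / num_samples

torch.manual_seed(0)

# Toy model and a fake "loader": a list of (images, labels) minibatches.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 4 * 4, 10))
toy_loader = [
    (torch.randn(5, 3, 4, 4), torch.randint(0, 10, (5,))) for _ in range(3)
]

acc = check_accuracy(toy_loader, toy_model)
print(f"accuracy: {acc:.2%}")
```

Calling model.eval() matters for my network in particular because it contains BatchNorm layers, which behave differently in training and evaluation mode.
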

Link to my solution:

https://github.com/danlee0113/cs231n
