[PyTorch] Implementing an Image Classifier - PyTorch Module API

윤형준 · September 4, 2022

PyTorch Module API

In Barebone PyTorch, we managed every parameter tensor by hand. That is fine for a small network, but as the network grows it becomes impractical to define and track each parameter individually.

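PyTorch provides the nn.Module API, which lets you define arbitrary networks while automatically tracking their learnable parameters. In Part II we implemented SGD ourselves, but PyTorch also provides the torch.optim package, which offers SGD along with many other optimizers. See the doc for the definitions of the available optimizers.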
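As a rough illustration of how a module and an optimizer fit together, here is a minimal sketch. The single nn.Linear layer, the random data, and the learning rates are assumptions chosen only for this example; they are not part of the assignment.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# A single linear layer stands in for an arbitrary nn.Module (illustrative only).
model = nn.Linear(4, 2)

# model.parameters() yields every learnable tensor (here: weight and bias).
optimizer = optim.SGD(model.parameters(), lr=1e-2)
# Switching optimizers only changes this one line, for example:
# optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Random inputs and labels, made up for the example.
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))

before = model.weight.detach().clone()
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# After one step the weights should differ from their initial values.
weights_changed = not torch.equal(before, model.weight.detach())
print(weights_changed)
```

The point of the sketch is that the update rule lives entirely inside the optimizer object, so no parameter is touched by hand.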

To use the Module API, follow these steps:

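Subclass nn.Module. Define a network class that inherits from nn.Module and give it an intuitive name such as TwoLayerFC.
In the class's __init__(), define every layer that makes up the model. nn.Linear and nn.Conv2d are themselves subclasses of nn.Module and contain their own learnable parameters, so you do not need to initialize any raw Tensors yourself. To learn more about the built-in layers, see the doc. Warning: call super().__init__() first, before defining any layers.
In the forward() method, define how the layers are connected. Chain the layers defined in __init__ so that each layer's input shape matches the previous layer's output shape. Do not create new learnable parameters in forward(); all parameters must be created in __init__.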
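The automatic parameter tracking described in these steps can be checked directly. The following is a minimal sketch; the class name TinyNet and its layer sizes are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()          # must run before any layer is assigned
        self.fc = nn.Linear(3, 2)   # registered automatically as a submodule

    def forward(self, x):
        return self.fc(x)           # forward only wires existing layers together

net = TinyNet()

# named_parameters() lists the tensors that nn.Module tracked for us.
param_shapes = {name: tuple(p.shape) for name, p in net.named_parameters()}
print(param_shapes)
```

Assigning a layer as an attribute in __init__ is what registers its weight and bias, which is why parameters created inside forward() would not be tracked the same way.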

Module API: Two-Layer Network
Below is a concrete example of a fully-connected network with two layers.

class TwoLayerFC(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        # Define the two layers.
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, num_classes)

        # Initialize the weights with functions from the nn.init package.
        # http://pytorch.org/docs/master/nn.html#torch-nn-init 
        nn.init.kaiming_normal_(self.fc1.weight)
        nn.init.kaiming_normal_(self.fc2.weight)
    
    def forward(self, x):
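        # forward defines how the layers are connected.
        # flatten() is assumed to be a helper defined earlier in the notebook
        # that reshapes x to (N, -1).
        x = flatten(x)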
        scores = self.fc2(F.relu(self.fc1(x)))
        return scores

def test_TwoLayerFC():
    input_size = 50
    x = torch.zeros((64, input_size), dtype=dtype)  # minibatch size 64, feature dimension 50
    model = TwoLayerFC(input_size, 42, 10)
    scores = model(x)
    print(scores.size())  # you should see [64, 10]

test_TwoLayerFC()

Module API: Three-Layer ConvNet

Now implement a three-layer ConvNet yourself: two convolutional layers followed by a fully-connected layer. The architecture is the same as the one defined in Part II:

Convolutional layer with channel_1 5x5 filters with zero-padding of 2
ReLU
Convolutional layer with channel_2 3x3 filters with zero-padding of 1
ReLU
Fully-connected layer to num_classes classes
Initialize the layers you define with the Kaiming normal initialization method.

HINT: http://pytorch.org/docs/stable/nn.html#conv2d

Once the ConvNet is implemented, running the test_ThreeLayerConvNet function should print an output score of shape (64, 10).
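To see which input size the fully-connected layer needs, it helps to work through the spatial sizes by hand. The sketch below uses the standard convolution output-size formula, (size + 2 * padding - kernel) / stride + 1, with the 32x32 input from the test function. The helper name conv_output_size is hypothetical.

```python
def conv_output_size(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

h = 32                                         # input height/width in the test
h = conv_output_size(h, kernel=5, padding=2)   # first conv: 5x5, padding 2
h = conv_output_size(h, kernel=3, padding=1)   # second conv: 3x3, padding 1

channel_2 = 8                                  # value used in test_ThreeLayerConvNet
fc_in_features = channel_2 * h * h
print(h, fc_in_features)  # 32 8192
```

Both convolutions keep the 32x32 resolution, so the fully-connected layer takes channel_2 * 32 * 32 input features.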

PyTorch functions
nn.Conv2d : torch.nn.Conv2d(in_channels: int, out_channels: int, kernel_size, stride = 1, padding = 0)
nn.MaxPool2d : torch.nn.MaxPool2d(kernel_size, stride = None, padding = 0)
nn.Linear : torch.nn.Linear(in_features: int, out_features: int, bias: bool = True)
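A quick way to get familiar with these three layers is to push a dummy batch through them and look at the shapes. This is only a sketch: the channel counts are arbitrary, and the pooling layer appears purely to illustrate its signature, since the three-layer ConvNet specified above does not use pooling.

```python
import torch
import torch.nn as nn

x = torch.zeros(64, 3, 32, 32)  # dummy minibatch: 64 images of size 3x32x32

conv = nn.Conv2d(in_channels=3, out_channels=12, kernel_size=5, padding=2)
pool = nn.MaxPool2d(kernel_size=2)  # stride=None means stride equals kernel_size
fc = nn.Linear(in_features=12 * 16 * 16, out_features=10)

conv_out = conv(x)                           # padding=2 keeps the 32x32 size
pool_out = pool(conv_out)                    # halves height and width
scores = fc(pool_out.flatten(start_dim=1))   # flatten all but the batch dimension

print(conv_out.shape, pool_out.shape, scores.shape)
```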
References

https://tutorials.pytorch.kr/beginner/examples_nn/two_layer_net_module.html

https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

https://pytorch.org/docs/stable/generated/torch.nn.Linear.html

Tips

http://taewan.kim/post/cnn/

http://taewan.kim/post/cnn/#4-cnn-%EC%9E%85%EC%B6%9C%EB%A0%A5-%ED%8C%8C%EB%A6%AC%EB%AF%B8%ED%84%B0-%EA%B3%84%EC%82%B0

https://mrsyee.github.io/image%20processing/2018/11/28/cnn_technique/

# Req. 1-4	Define the Three-Layer ConvNet class using the Module API

class ThreeLayerConvNet(nn.Module):
    def __init__(self, in_channel, channel_1, channel_2, num_classes):
        super().__init__()
        ########################################################################
        # TODO: Set up the layers you need for a three-layer ConvNet with the  #
        # architecture defined above.                                          #
        ########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

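        self.conv1 = nn.Conv2d(in_channel, channel_1, kernel_size=5, padding=2)
        nn.init.kaiming_normal_(self.conv1.weight)
        self.conv2 = nn.Conv2d(channel_1, channel_2, kernel_size=3, padding=1)
        nn.init.kaiming_normal_(self.conv2.weight)
        # Both convolutions preserve the 32x32 spatial size, so the flattened
        # feature vector has channel_2 * 32 * 32 elements.
        self.fc = nn.Linear(channel_2 * 32 * 32, num_classes)
        nn.init.kaiming_normal_(self.fc.weight)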

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ########################################################################
        #                          END OF YOUR CODE                            #       
        ########################################################################

    def forward(self, x):
        scores = None
        ########################################################################
        # TODO: Implement the forward function for a 3-layer ConvNet. you      #
        # should use the layers you defined in __init__ and specify the        #
        # connectivity of those layers in forward()                            #
        ########################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

        x1 = self.conv1(x)
        r1 = nn.functional.relu(x1)
        x2 = self.conv2(r1)
        r2 = nn.functional.relu(x2)
        # Flatten everything except the batch dimension before the FC layer.
        scores = self.fc(r2.flatten(start_dim=1))

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        ########################################################################
        #                             END OF YOUR CODE                         #
        ########################################################################
        return scores


def test_ThreeLayerConvNet():
    x = torch.zeros((64, 3, 32, 32), dtype=dtype)  # minibatch size 64, image size [3, 32, 32]
    model = ThreeLayerConvNet(in_channel=3, channel_1=12, channel_2=8, num_classes=10)
    scores = model(x)
    print(scores.size())  # you should see [64, 10]
test_ThreeLayerConvNet()

Module API: Check Accuracy

Given a validation or test set, this measures the model's classification accuracy.

This version differs slightly from the one in Part II, where the parameters were passed in manually.

# Req. 1-5	Implement the evaluation function for the Module API

def check_accuracy_part34(loader, model):
    if loader.dataset.train:
        print('Checking accuracy on validation set')
    else:
        print('Checking accuracy on test set')   
    num_correct = 0
    num_samples = 0
    model.eval()  # set model to evaluation mode
    
    ########################################################################
    # TODO: Implement the function for evaluating the accuracy of the model#
    ########################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    with torch.no_grad():
        for x, y in loader:
            x = x.to(device=device, dtype=dtype)  # move to device, e.g. GPU
            y = y.to(device=device, dtype=torch.long)
            scores = model(x)
            _, preds = scores.max(1)
            num_correct += (preds == y).sum()
            num_samples += preds.size(0)
        acc = float(num_correct) / num_samples
        print('Got %d / %d correct (%.2f)' % (num_correct, num_samples, 100 * acc))

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ########################################################################
    #                          END OF YOUR CODE                            #       
    ########################################################################
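The core of the accuracy check is the scores.max(1) call, which returns the index of the highest score in each row. Here is a minimal sketch on a hand-made batch; the score values and labels are invented for the example.

```python
import torch

# Four samples, three classes; values are made up for illustration.
scores = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.3, 0.2],
                       [0.0, 0.1, 3.0],
                       [2.2, 2.1, 0.0]])
y = torch.tensor([1, 0, 2, 1])

# max over dim=1 returns (values, indices); the indices are the predicted classes.
_, preds = scores.max(dim=1)

num_correct = (preds == y).sum().item()
acc = num_correct / y.size(0)
print(preds.tolist(), num_correct, acc)  # [1, 0, 2, 0] 3 0.75
```

The last sample is misclassified because class 0 scores slightly higher than the true class 1, which gives 3 correct out of 4.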

Module API: Training Loop

Now write the training loop. Instead of writing the parameter-update code by hand, we use an optimizer from the torch.optim package, which updates the parameters for us.

# Req. 1-6	Implement the training loop for the Module API

def train_part34(model, optimizer, epochs=1):
    """
    Train a model on CIFAR-10 using the PyTorch Module API.
    
    Inputs:
    - model: A PyTorch Module giving the model to train.
    - optimizer: An Optimizer object we will use to train the model
    - epochs: (Optional) A Python integer giving the number of epochs to train for
    
    Returns: Nothing, but prints model accuracies during training.
    """
    model = model.to(device=device)  # move the model parameters to CPU/GPU
    ########################################################################
    # TODO: Implement the training loop                                    #
    ########################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    for e in range(epochs):
        for t, (x, y) in enumerate(loader_train):
            model.train()  # put model to training mode
            x = x.to(device=device, dtype=dtype)  # move to device, e.g. GPU
            y = y.to(device=device, dtype=torch.long)

            scores = model(x)
            loss = F.cross_entropy(scores, y)

            # Zero out all of the gradients for the variables which the optimizer
            # will update.
            optimizer.zero_grad()

            # This is the backwards pass: compute the gradient of the loss with
            # respect to each  parameter of the model.
            loss.backward()

            # Actually update the parameters of the model using the gradients
            # computed by the backwards pass.
            optimizer.step()

            if t % print_every == 0:
                print('Iteration %d, loss = %.4f' % (t, loss.item()))
                check_accuracy_part34(loader_val, model)
                print()

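    # 1) The outer for loop repeats for the given number of epochs.
    # 2) The inner for loop loads one batch at a time from loader_train, which holds the training set.
    # 3) Move the loaded inputs and labels to the device (GPU or CPU).
    # 4) Feed the inputs to the model to run the forward pass.
    # 5) Use the loss function to compare the predictions with the labels.
    # 6) Reset the gradients tracked by the optimizer to zero.
    # 7) Backpropagate the loss to compute the gradients.
    # 8) Call optimizer.step() to update the parameters.
    # 9) Print the loss and accuracy every print_every iterations.
    # 10) Go back to 2) for the next batch; once every batch is done, go back to 1) for the next epoch.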

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ########################################################################
    #                          END OF YOUR CODE                            #       
    ########################################################################
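Step 6 in the loop above matters because PyTorch accumulates gradients across backward() calls instead of overwriting them. The sketch below demonstrates this on a tiny linear layer; the model, input, and loss are assumptions made only for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)     # toy model, for illustration only
x = torch.randn(4, 3)

# First backward pass: gradients start from nothing.
model(x).sum().backward()
first = model.weight.grad.clone()

# Second backward pass without clearing: the new gradient is added to the old one.
model(x).sum().backward()
accumulated = model.weight.grad.clone()

# Clearing first gives the same gradient as the very first pass.
model.zero_grad()
model(x).sum().backward()
fresh = model.weight.grad.clone()

print(torch.allclose(accumulated, 2 * first), torch.allclose(fresh, first))
```

Without the reset, each update would use the sum of the gradients from all earlier batches rather than the gradient of the current batch.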

Module API: Train a Two-Layer Network

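Now run the training loop. Unlike Part II, you do not need to define the parameters yourself.

Create a TwoLayerFC object by passing in the input size, the hidden layer size, and the output size (number of classes).

Also define an optimizer that updates the learnable parameters of TwoLayerFC.

You should reach a classification accuracy of at least 40% after one epoch without changing any hyperparameters.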

hidden_layer_size = 4000
learning_rate = 1e-2
model = TwoLayerFC(3 * 32 * 32, hidden_layer_size, 10)
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

train_part34(model, optimizer)
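To get a feel for how many values the optimizer updates here, the parameter count can be worked out by hand: each nn.Linear layer holds in_features * out_features weights plus out_features biases. The helper name linear_param_count is hypothetical; the sizes are the ones used above.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Number of learnable values in one fully-connected layer."""
    return in_features * out_features + (out_features if bias else 0)

input_size = 3 * 32 * 32      # flattened 3x32x32 image
hidden_layer_size = 4000
num_classes = 10

fc1_params = linear_param_count(input_size, hidden_layer_size)
fc2_params = linear_param_count(hidden_layer_size, num_classes)
total_params = fc1_params + fc2_params
print(fc1_params, fc2_params, total_params)  # 12292000 40010 12332010
```

Over twelve million values are tracked through model.parameters(), which is the kind of bookkeeping that would be impractical to do by hand.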

Module API: Train a Three-Layer ConvNet

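Now use the Module API to build and train a three-layer ConvNet. This should look very similar to training the two-layer network. You should reach a classification accuracy of at least 40% after one epoch without changing any hyperparameters. Train the model with stochastic gradient descent.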

# Req. 1-7	Create a ThreeLayerConvNet instance and define its optimizer

learning_rate = 3e-3
channel_1 = 32
channel_2 = 16

model = None
optimizer = None
################################################################################
# TODO: Instantiate your ThreeLayerConvNet model and a corresponding optimizer #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

model = ThreeLayerConvNet(3, channel_1, channel_2, 10)
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                                 END OF YOUR CODE                             
################################################################################

train_part34(model, optimizer)
