Weird PyTorch 5: Stacking Layers in Style

ddang ddang ball · October 23, 2022


Let's look at how models get defined in PyTorch at different skill levels.
The models here were built for FashionMNIST.

Low (Noob)

import torch
from torch import nn

class mymodel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 3)
        self.layer2 = nn.Linear(3, 20)
        # ... (more layers, one attribute each)

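Every layer is defined one by one. If you are classifying digits with MNIST, the odds that you stack layers like this are 5000%.
The catch is that when you define the forward pass, you have to call every one of those layers by hand, one at a time.
(I have lost count of the times my model refused to run because I left a layer out of forward.)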
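
To make the pain concrete, here is a minimal sketch of what the forward pass looks like in this style. The class name NoobModel and the third layer are made up for illustration; they are not from the original model.

```python
import torch
from torch import nn

class NoobModel(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 3)
        self.layer2 = nn.Linear(3, 20)
        self.layer3 = nn.Linear(20, 5)  # assumed extra layer

    def forward(self, x):
        # Every layer has to be called by hand, in the right order.
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)  # easy to forget when there are many of these
        return x

out = NoobModel()(torch.randn(4, 10))
print(out.shape)  # torch.Size([4, 5])
```

Each new layer means one more attribute in __init__ and one more line in forward, and the two have to stay in sync.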

What do you do with 100 layers? self.layer1, self.layer2, self.layer3 ... self.layer100? Are you really going to do that much devoted grunt work?

If every model had to be defined this way from start to finish, ResNet would never have existed.

Mid (Pro)

Lower mid

class mymodel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Linear(10, 3),
            nn.Linear(3, 20),
            # ... (more layers go here)
        )

This is not bad. Defining the forward pass is less painful than in the previous case.
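
As a rough sketch of why it is less painful: with nn.Sequential, forward collapses to a single call. The class name SeqModel is hypothetical, and only the two layers shown above are included.

```python
import torch
from torch import nn

class SeqModel(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Linear(10, 3),
            nn.Linear(3, 20),
        )

    def forward(self, x):
        # One call runs every layer inside the Sequential, in order.
        return self.layer1(x)

out = SeqModel()(torch.randn(4, 10))
print(out.shape)  # torch.Size([4, 20])
```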

But! If the same block repeats many times, stacking as shown below is better for your sanity.

Upper mid

Like this.

class Block(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(input_dim, hidden_dim, 3),
                                    nn.BatchNorm2d(hidden_dim),
                                    nn.ReLU())

    def forward(self, x):
        return self.block1(x)

Define the block first,

class mymodel(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = Block(1, 3)
        self.block2 = Block(3, 15)
        # ... (more blocks)

then drop the blocks into the model. It reads much more cleanly (and it looks cool).

class Block(nn.Module):
  def __init__(self, input_dim, hidden_dim):
    super().__init__()
    self.block1 = nn.Sequential(nn.Conv2d(input_dim, hidden_dim, 3),
                                nn.BatchNorm2d(hidden_dim),
                                nn.ReLU())
  def forward(self, x):
    return self.block1(x)

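class model1(nn.Module):
  def __init__(self):
    super().__init__()
    self.block1 = Block(1, 3)
    self.block2 = Block(3, 5)
    # 2880 = 5 channels * 24 * 24, the size left after two unpadded 3x3 convs on a 28x28 image.
    self.linear = nn.Linear(2880, 10)
    
  def forward(self, x):
    x = self.block1(x)
    x = self.block2(x)
    x = torch.flatten(x, start_dim = 1)  # keep the batch dimension, flatten the rest
    x = self.linear(x)
    return x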

I defined a block that runs conv - batch norm - ReLU and then used it to build the model.
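
In case the 2880 in nn.Linear looks like a magic number, here is a small sketch that checks where it comes from. It assumes 28x28 grayscale inputs (the FashionMNIST image size) and uses bare conv layers with the same settings as Block(1, 3) and Block(3, 5), since batch norm and ReLU do not change the shape.

```python
import torch
from torch import nn

# Same conv settings as Block(1, 3) and Block(3, 5): 3x3 kernel, no padding.
convs = nn.Sequential(nn.Conv2d(1, 3, 3), nn.Conv2d(3, 5, 3))

x = torch.randn(2, 1, 28, 28)            # dummy batch of two 28x28 grayscale images
feat = convs(x)
print(feat.shape)                         # torch.Size([2, 5, 24, 24])

flat = torch.flatten(feat, start_dim=1)
print(flat.shape[1])                      # 2880 = 5 * 24 * 24
```

Each unpadded 3x3 conv trims 2 pixels from the height and width, so 28 becomes 26 and then 24.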

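But this approach has one more problem.
What if you want to check the model's accuracy for different numbers of layers??
Are you going to keep tearing the model apart, rewriting it, and checking the result, over and over???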

High (Hacker)

To answer that question, I brought a great trick!!
The algorithm goes like this.

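1. Define a block.
2. Create an empty list inside the model.
3. Automatically append as many blocks as you want to that list.
4. Call nn.Sequential(*list_name) on the finished list and you are done!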

Caveats)
1. nn.Sequential(*layers): you must put the star in front of the list name!
2. When you instantiate the model, you must pass the list in!

layer_list = [10, 20, 30, 40]  # ... extend with as many channel sizes as you like
model = model1(layer_list)

The full code looked like this.

class Block(nn.Module):
  def __init__(self, input_dim, hidden_dim):
    super().__init__()
    self.block1 = nn.Sequential(nn.Conv2d(input_dim, hidden_dim, 3),
                                nn.BatchNorm2d(hidden_dim),
                                nn.ReLU())
  def forward(self, x):
    return self.block1(x)

Define the block,

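class model1(nn.Module):
  def __init__(self, layer_list):
    super().__init__()
    layer = []
    # One Block per adjacent pair: layer_list[i-1] input channels -> layer_list[i] output channels.
    for i in range(1, len(layer_list)):
      layer.append(Block(layer_list[i-1], layer_list[i]))
    self.conv = nn.Sequential(*layer)
    # 1944 = 6 * 18 * 18, which matches layer_list = [1, 2, 3, 4, 5, 6] on 28x28 inputs.
    self.linear = nn.Linear(1944, 10)
  def forward(self, x):
    x = self.conv(x)
    x = torch.flatten(x, start_dim = 1)
    x = self.linear(x)
    return x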

then stack the blocks in a list and wrap them in Sequential.

    layer = []
    for i in range(1, len(layer_list)):
      layer.append(Block(layer_list[i-1], layer_list[i]))
    self.conv = nn.Sequential(*layer)

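Here is just the part that stacks the layers. After defining an empty list,
it appends one block per adjacent pair of elements in layer_list, so len(layer_list) - 1 blocks in total.
Each block's input_dim is the previous element of layer_list,
and its output dimension (hidden_dim in Block) is the current element.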

The list named layer ends up holding

Block(layer_list[0], layer_list[1]),
Block(layer_list[1], layer_list[2]),
Block(layer_list[2], layer_list[3]),
.
.
.
Block(layer_list[-2], layer_list[-1])
in that order.
Once they are stacked, wrapping the list with nn.Sequential(*layer)
lets the blocks inside layer run one after another.
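
The pairing above can be checked with plain Python, no PyTorch needed. This is just a sketch using the same layer_list that appears later in the post; the zip variant is an equivalent alternative, not something the original code uses.

```python
layer_list = [1, 2, 3, 4, 5, 6]

# (input_dim, output_dim) for each block, using the same indexing as the loop.
pairs = [(layer_list[i - 1], layer_list[i]) for i in range(1, len(layer_list))]
print(pairs)  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

# Equivalent formulation: zip the list against itself shifted by one.
zipped = list(zip(layer_list[:-1], layer_list[1:]))
print(zipped == pairs)  # True
```

Six channel sizes give five blocks, one fewer than the length of the list.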

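Q) Why is there a '*' in front of the argument?
nn.Sequential accepts a variable number of modules as separate positional arguments. A plain list passed as a single argument is not accepted, so the list has to be unpacked: putting * in front of it passes each block as its own argument, and inside the function they arrive together as a tuple.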

https://ssungkang.tistory.com/entry/python-%EC%96%B8%ED%8C%A8%ED%82%B9-args-kwargs
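
A tiny pure-Python sketch of what the star does. The function show_args and the string stand-ins are hypothetical; they only mimic a function that, like nn.Sequential, takes *args.

```python
def show_args(*args):
    # Inside the function, the positional arguments arrive as one tuple.
    return args

blocks = ["block_a", "block_b", "block_c"]  # stand-ins for nn.Module objects

print(show_args(blocks))   # (['block_a', 'block_b', 'block_c'],)  one argument: the list itself
print(show_args(*blocks))  # ('block_a', 'block_b', 'block_c')     three separate arguments
```

Without the star, the function sees a single argument that happens to be a list; with it, each element becomes its own argument.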

layer_list = [1, 2, 3, 4, 5, 6]
ss = model1(layer_list)
print(ss)

This way, you can change the model just by changing the list.
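
One catch is that the hard-coded 1944 in nn.Linear only fits layer_list = [1, 2, 3, 4, 5, 6]. Below is a rough sketch of sweeping over several depths without recomputing that number. It swaps in nn.LazyLinear, which infers its input size on the first forward pass; that swap and the helper name conv_stack are my assumptions, not part of the original model.

```python
import torch
from torch import nn

def conv_stack(layer_list):
    # Hypothetical helper: one conv-bn-relu group per adjacent pair of channel sizes.
    layers = []
    for i in range(1, len(layer_list)):
        layers += [nn.Conv2d(layer_list[i - 1], layer_list[i], 3),
                   nn.BatchNorm2d(layer_list[i]),
                   nn.ReLU()]
    return nn.Sequential(*layers)

x = torch.randn(2, 1, 28, 28)  # dummy FashionMNIST-sized batch
for layer_list in ([1, 4], [1, 4, 8], [1, 4, 8, 16]):
    model = nn.Sequential(conv_stack(layer_list), nn.Flatten(), nn.LazyLinear(10))
    out = model(x)  # the first call lets LazyLinear infer its in_features
    print(len(layer_list) - 1, "blocks ->", tuple(out.shape),
          "in_features:", model[2].in_features)
```

In a real experiment you would train and evaluate each model inside the loop; here it only checks that every depth builds and runs.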
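+) ResNet implementations stack their blocks through a neat function called _make_layer (the name used in torchvision's ResNet).
If you look at that function, it stacks blocks in a list and returns a Sequential via nn.Sequential(*layers).
I will post about it later when I find the time.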

+) OrderedDict
Some veterans reportedly stack layers with a dictionary that keeps its elements in insertion order.

from collections import OrderedDict

class model1(nn.Module):
  def __init__(self):
    super().__init__()
    layer = OrderedDict([])
    for i in range(5):
      layer.update({f"layer{i}": nn.Linear(3, 3)})
    print(layer)
    self.layers = nn.Sequential(layer)
  def forward(self,x):
    x = self.layers(x)
    return x

The nice thing about this approach is that you can give your deep learning blocks lovely names (?).
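
A short sketch of what those names buy you, assuming the same layer0 ... layer4 naming as above: the layers can be listed and fetched by name instead of by index.

```python
from collections import OrderedDict
from torch import nn

layers = nn.Sequential(OrderedDict(
    (f"layer{i}", nn.Linear(3, 3)) for i in range(5)
))

names = [name for name, _ in layers.named_children()]
print(names)          # ['layer0', 'layer1', 'layer2', 'layer3', 'layer4']
print(layers.layer2)  # access by name; this is the same module as layers[2]
```

The names also show up in print(model) and in state_dict keys, which helps when you inspect or load weights.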

???

(Added 10/31)
Today I came across a really curious way to define a block.

class conv_block(nn.Sequential):
    def __init__(self, input_dim, hidden_dim, kernel_size=3, stride=1, padding=0,
                 norm_layer=nn.BatchNorm2d, activation_layer=nn.ReLU, groups=1):
        # norm_layer and activation_layer are classes; they are instantiated here.
        super().__init__(
            nn.Conv2d(input_dim, hidden_dim, kernel_size, stride=stride,
                      padding=padding, groups=groups, bias=True),
            norm_layer(hidden_dim),
            activation_layer(),  # nn.ReLU takes no channel-count argument
        )

It can be defined this way too! I inherited from nn.Sequential, crammed everything into super().__init__(), and it ran just fine. Isn't that surprising?

A block defined like that is used as follows.

input_dim = 1
hidden_dim = 3
kernel_size = 3
stride = 1
padding = "same"
norm_layer = nn.BatchNorm2d
activation_layer = nn.ReLU

block1 = conv_block(input_dim, hidden_dim, kernel_size, stride,
                    padding, norm_layer, activation_layer)

That is how you build the block. The caveat is that norm_layer is passed as nn.BatchNorm2d, with no parentheses after it! The same goes for the activation layer!

That is because the block calls them itself inside super().__init__(...), for example norm_layer(hidden_dim).
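
Here is a small sketch of the difference between passing the class and passing an instance. The helper make_norm is hypothetical; it just mirrors the norm_layer(hidden_dim) call made inside the block.

```python
import torch
from torch import nn

def make_norm(norm_layer, channels):
    # Hypothetical helper mirroring what the block does internally.
    return norm_layer(channels)

bn = make_norm(nn.BatchNorm2d, 3)   # pass the class; it gets instantiated here
print(bn.num_features)              # 3

raised = False
try:
    # Passing an instance instead: calling it runs forward() on the integer 3.
    make_norm(nn.BatchNorm2d(3), 3)
except Exception as e:
    raised = True
    print("failed with", type(e).__name__)
```

So nn.BatchNorm2d (no parentheses) is a recipe the block can call later, while nn.BatchNorm2d(3) is an already-built layer that expects a tensor.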
