[Week 2] Assignment #1

혜 콩 · September 27, 2022

Naver Boostcamp U Stage

🚩 Model structure (execution order)

import torch
from torch import nn
from torch.nn.parameter import Parameter

# Function
class Function_A(nn.Module):
    def __init__(self, name):
        super().__init__()
        self.name = name

    def forward(self, x):
        x = x * 2
        return x

class Function_B(nn.Module):
    def __init__(self):
        super().__init__()
        self.W1 = Parameter(torch.Tensor([10]))
        self.W2 = Parameter(torch.Tensor([2]))

    def forward(self, x):
        x = x / self.W1
        x = x / self.W2

        return x

class Function_C(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('duck', torch.Tensor([7]), persistent=True)

    def forward(self, x):
        x = x * self.duck
        
        return x

class Function_D(nn.Module):
    def __init__(self):
        super().__init__()
        self.W1 = Parameter(torch.Tensor([3]))
        self.W2 = Parameter(torch.Tensor([5]))
        self.c = Function_C()

    def forward(self, x):
        x = x + self.W1
        x = self.c(x)
        x = x / self.W2

        return x


# Layer
class Layer_AB(nn.Module):
    def __init__(self):
        super().__init__()

        self.a = Function_A('duck')
        self.b = Function_B()

    def forward(self, x):
        x = self.a(x) / 5
        x = self.b(x)

        return x

class Layer_CD(nn.Module):
    def __init__(self):
        super().__init__()

        self.c = Function_C()
        self.d = Function_D()

    def forward(self, x):
        x = self.c(x)
        x = self.d(x) + 1

        return x


# Model
class Model(nn.Module):
    def __init__(self):
        super().__init__()

        self.ab = Layer_AB()
        self.cd = Layer_CD()

    def forward(self, x):
        x = self.ab(x)
        x = self.cd(x)

        return x

x = torch.tensor([7])

model = Model()
model(x)

[ ] = execution order

x = tensor([7])

layer_ab(x)
[1] __init__ of function_a (named 'duck') runs
[2] __init__ of function_b runs
[5] x = result of function_a's forward() / 5
[6] x = result of function_b's forward()

layer_cd(x)
[3] __init__ of function_c, which registers the duck buffer, runs
[4] __init__ of function_d runs (it also builds its own nested function_c)
[7] x = result of function_c's forward()
[8] x = result of function_d's forward() + 1
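
To check the forward steps by hand, here is a minimal sketch that replays the arithmetic of steps [5] to [8] on a plain float tensor. The constants are copied from the class definitions above; the input is written as 7.0 rather than 7 only so that every step stays in floating point.

```python
import torch

x = torch.tensor([7.0])

# Layer_AB
x = x * 2 / 5        # [5] Function_A doubles x, Layer_AB divides by 5  -> 2.8
x = x / 10 / 2       # [6] Function_B divides by W1=10, then by W2=2    -> 0.14

# Layer_CD
x = x * 7            # [7] Function_C multiplies by the buffer duck=7   -> 0.98
x = (x + 3) * 7 / 5  # [8] Function_D: add W1=3, nested Function_C (*7),
                     #     divide by W2=5                               -> 5.572
x = x + 1            #     Layer_CD adds 1 to Function_D's result       -> 6.572

print(x)
```

Up to floating-point rounding, this is the value that model(x) should return for the Model defined above.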

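Every __init__ runs exactly once, when Model() is constructed ([1] to [4]); nothing is computed at that point, and each submodule simply waits until it is called. The forward() methods run only when model(x) is called ([5] to [8]), in the order written in each parent's forward().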
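
The split between construction and calling can be seen by logging each method. This is only a sketch: Leaf and Tiny are hypothetical stand-ins for the Function_* and Layer_* classes, and calls is a helper list introduced here for illustration.

```python
import torch
from torch import nn

calls = []  # records the order in which methods run

class Leaf(nn.Module):
    # Hypothetical stand-in for Function_A ... Function_D
    def __init__(self, name):
        super().__init__()
        self.name = name
        calls.append(f"{name}.__init__")

    def forward(self, x):
        calls.append(f"{self.name}.forward")
        return x

class Tiny(nn.Module):
    # Hypothetical stand-in for a Layer_* class
    def __init__(self):
        super().__init__()
        self.first = Leaf("first")
        self.second = Leaf("second")

    def forward(self, x):
        return self.second(self.first(x))

model = Tiny()                       # both __init__ calls happen here
after_construction = list(calls)     # snapshot before any forward pass
model(torch.tensor([7.0]))           # both forward calls happen here

print(after_construction)  # ['first.__init__', 'second.__init__']
print(calls[2:])           # ['first.forward', 'second.forward']
```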

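🍏 Why are W and b declared as Parameter?

Declaring W and b as Parameter registers them on the module: they appear in parameters(), so an optimizer can update them, they have requires_grad=True by default, so gradients are computed for them, and they are saved in state_dict().
If they are stored as plain Tensor attributes instead, the forward computation gives the same result, but by default no gradient is computed for them and their values are never updated. They are also left out of state_dict(), so they are ignored when the model is saved.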
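
The difference can be checked directly. In this sketch, WithParameter and WithTensor are hypothetical modules made up for the comparison; they hold the same value, once as a Parameter and once as a plain tensor attribute.

```python
import torch
from torch import nn
from torch.nn.parameter import Parameter

class WithParameter(nn.Module):
    def __init__(self):
        super().__init__()
        self.W = Parameter(torch.tensor([10.0]))  # registered as a parameter

class WithTensor(nn.Module):
    def __init__(self):
        super().__init__()
        self.W = torch.tensor([10.0])             # plain attribute, not registered

p, t = WithParameter(), WithTensor()

print(p.W.requires_grad, t.W.requires_grad)                  # True False
print(len(list(p.parameters())), len(list(t.parameters())))  # 1 0
print(list(p.state_dict()), list(t.state_dict()))            # ['W'] []
```

An optimizer built from t.parameters() would have nothing to update, which is why the plain tensor never changes during training.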

🚩 Buffer

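A buffer is a Tensor that is stored on a Module and used in its computation, but is not learned through training. Unlike a plain tensor attribute, a buffer registered with persistent=True is saved in state_dict().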
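
As a small sketch of how a buffer sits next to a parameter, the hypothetical Scale module below registers one parameter, one persistent buffer, and one non-persistent buffer (the names weight and scratch are made up for this example).

```python
import torch
from torch import nn

class Scale(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.tensor([2.0]))
        # persistent=True (the default): saved in state_dict
        self.register_buffer("duck", torch.tensor([7.0]), persistent=True)
        # persistent=False: still a buffer, but left out of state_dict
        self.register_buffer("scratch", torch.tensor([0.0]), persistent=False)

    def forward(self, x):
        return x * self.weight * self.duck

m = Scale()
param_names = [name for name, _ in m.named_parameters()]
buffer_names = [name for name, _ in m.named_buffers()]
state_keys = set(m.state_dict().keys())

print(param_names)           # ['weight']
print(buffer_names)          # ['duck', 'scratch']
print(state_keys)            # contains 'weight' and 'duck', but not 'scratch'
print(m.duck.requires_grad)  # False
```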

🍏 Finding the parameters / buffers inside a module or model

module.named_buffers()
model.buffers()
Get a specific buffer by name: get_buffer("name")

for name, buffer in model.named_buffers():
    print(f"[ Name ] : {name}\n[ Buffer ] : {buffer}")
    print("-" * 30)
    
>>>    
[ Name ] : cd.c.duck
[ Buffer ] : tensor([7.])
------------------------------
[ Name ] : cd.d.c.duck
[ Buffer ] : tensor([7.])
------------------------------


# TODO: get the Buffer that belongs to Function_C!
buffer = model.get_buffer("cd.c.duck")
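
Parameters can be looked up the same way as buffers. The sketch below uses two hypothetical modules, Inner and Outer, to show named_parameters() and get_parameter() alongside get_buffer(); the dotted names follow the attribute path, just like cd.c.duck above.

```python
import torch
from torch import nn

class Inner(nn.Module):
    def __init__(self):
        super().__init__()
        self.W1 = nn.Parameter(torch.tensor([3.0]))
        self.register_buffer("duck", torch.tensor([7.0]))

class Outer(nn.Module):
    def __init__(self):
        super().__init__()
        self.inner = Inner()

model = Outer()

names = [name for name, _ in model.named_parameters()]
w1 = model.get_parameter("inner.W1")    # look up a parameter by dotted name
duck = model.get_buffer("inner.duck")   # look up a buffer by dotted name

print(names)             # ['inner.W1']
print(w1.item())         # 3.0
print(duck.item())       # 7.0
```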

🍏 I can't remember which modules are in my model!

for name, module in model.named_modules():
    print(f"[ Name ] : {name}\n[ Module ]\n{module}")
    print("-" * 30)
    
>>>
[ Name ] : 
[ Module ]
Model(
  (ab): Layer_AB(
    (a): Function_A()
    (b): Function_B()
  )
  (cd): Layer_CD(
    (c): Function_C()
    (d): Function_D(
      (c): Function_C()
    )
  )
)
------------------------------
[ Name ] : ab
[ Module ]
Layer_AB(
  (a): Function_A()
  (b): Function_B()
)
------------------------------
[ Name ] : ab.a
[ Module ]
Function_A()
------------------------------
[ Name ] : ab.b
[ Module ]
Function_B()
------------------------------
[ Name ] : cd
[ Module ]
Layer_CD(
  (c): Function_C()
  (d): Function_D(
    (c): Function_C()
  )
)
------------------------------
[ Name ] : cd.c
[ Module ]
Function_C()
------------------------------
[ Name ] : cd.d
[ Module ]
Function_D(
  (c): Function_C()
)
------------------------------
[ Name ] : cd.d.c
[ Module ]
Function_C()
------------------------------
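
When the full dump is too long, two related calls may help: named_children() lists only the direct submodules, and get_submodule() fetches one module by its dotted name. A minimal sketch on a hypothetical nested nn.Sequential:

```python
from torch import nn

model = nn.Sequential(
    nn.Sequential(nn.Linear(4, 3), nn.ReLU()),
    nn.Linear(3, 1),
)

# named_modules() walks the whole tree, starting with the model itself ('')
all_names = [name for name, _ in model.named_modules()]
# named_children() stops at the first level
child_names = [name for name, _ in model.named_children()]
# get_submodule() resolves a dotted name to a single module
inner_linear = model.get_submodule("0.0")

print(all_names)     # ['', '0', '0.0', '0.1', '1']
print(child_names)   # ['0', '1']
print(inner_linear)  # Linear(in_features=4, out_features=3, bias=True)
```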

🍏 hook

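A hook is a function registered to run automatically before or after a specific step of a program or function. An nn.Module keeps the following kinds of hooks: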

- forward_pre_hooks
- forward_hooks
- full_backward_hooks
- state_dict_hooks                  # used internally
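
The first three kinds in the list above can be registered from user code. The following sketch attaches one of each to a single nn.Linear and records when it fires; events and the three hook functions are hypothetical names introduced for this example.

```python
import torch
from torch import nn

layer = nn.Linear(2, 1)
events = []  # records which hook fired, in order

def pre_hook(module, inputs):
    # runs just before forward()
    events.append("forward_pre")

def fwd_hook(module, inputs, output):
    # runs just after forward()
    events.append("forward")

def bwd_hook(module, grad_input, grad_output):
    # runs during backward(), once the gradients w.r.t. the module inputs are computed
    events.append("full_backward")

handles = [
    layer.register_forward_pre_hook(pre_hook),
    layer.register_forward_hook(fwd_hook),
    layer.register_full_backward_hook(bwd_hook),
]

x = torch.ones(1, 2, requires_grad=True)
layer(x).sum().backward()
print(events)  # ['forward_pre', 'forward', 'full_backward']

# Each register_* call returns a handle; remove() detaches the hook again.
for handle in handles:
    handle.remove()
layer(x)
print(len(events))  # 3 (no new entries after removal)
```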

🍏 apply

import torch
from torch import nn

@torch.no_grad()
def init_weights(m):
    print('module:', m)
    if type(m) == nn.Linear:
        m.weight.fill_(1.0)
        print('linear apply:', m.weight)
    elif type(m) == nn.Sequential:
        print('It is sequential')

net = nn.Sequential(nn.Linear(5, 2), nn.Linear(2, 2))
print('------apply start------')
net.apply(init_weights)
print('---------end----------')

>>>
------apply start------
module: Linear(in_features=5, out_features=2, bias=True)
linear apply: Parameter containing:
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]], requires_grad=True)
module: Linear(in_features=2, out_features=2, bias=True)
linear apply: Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
module: Sequential(
  (0): Linear(in_features=5, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
It is sequential
---------end----------

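The output shows that apply calls the function not only on nn.Linear(5, 2) and nn.Linear(2, 2) inside the Sequential, but also on the Sequential itself (self). The children are visited first and the container last.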
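
The visiting order becomes clearer with one more level of nesting. In this sketch, record and visited are hypothetical helpers that only log the class name of each module that apply passes in.

```python
from torch import nn

visited = []

def record(m):
    # apply() calls this once per module; we only log the class name
    visited.append(type(m).__name__)

net = nn.Sequential(
    nn.Sequential(nn.Linear(5, 2), nn.ReLU()),
    nn.Linear(2, 2),
)
returned = net.apply(record)

print(visited)
# ['Linear', 'ReLU', 'Sequential', 'Linear', 'Sequential']
print(returned is net)  # True: apply returns the module it was called on
```

Each container is logged only after all of its children, so the outermost Sequential comes last.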

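✍🏻 Retrospective

The first assignment took me two days to finish... I was able to study it step by step, but trying to understand everything from the docs alone took a lot of time.
Even if I can't memorize it all, I wanted to understand it fully before moving on, so I asked a lot of questions and wrote things up to help myself. It took a long time, but I feel I grew a lot over these two days, and that feels rewarding!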
