https://discuss.pytorch.org/t/module-children-vs-module-modules/4551/11
https://pytorch.org/vision/stable/models.html
We fetch pre-trained parameters using import torchvision.models as models
and only use the models provided there. You could load parameters from other sources, but then their validity becomes a question; the procedure is the same either way.
Basically, you can load either the bare model, or the model together with its pretrained parameters.
Note that if you load only the bare model and print its state_dict()
you will still see parameters, but these are not pretrained parameters; they are the initial weights of each layer in the model. Be careful not to confuse the two.
Load model
import torchvision.models as models

# note: torchvision >= 0.13 deprecates pretrained=True in favor of
# weights=models.VGG16_Weights.IMAGENET1K_V1
vgg16 = models.vgg16(pretrained=True)
Then entering
vgg16
prints the model's structure, and
vgg16.state_dict()
shows the pretrained parameters.
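To confirm the earlier caution that a bare model's state_dict() holds initial weights rather than pretrained ones, you can compare the same tensor in both versions. A minimal sketch; the key 'features.0.weight' is taken from the VGG16 state_dict shown in this post:

import torch
import torchvision.models as models

fresh = models.vgg16(pretrained=False)   # randomly initialized weights
trained = models.vgg16(pretrained=True)  # ImageNet pretrained weights

# Same keys, different values: the architecture defines the keys,
# the loaded weights define the values.
key = 'features.0.weight'
print(fresh.state_dict().keys() == trained.state_dict().keys())          # True
print(torch.equal(fresh.state_dict()[key], trained.state_dict()[key]))   # False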
Viewing a loaded model's structure is as simple as the above. Easy.
Now the model needs to be modified. The canonical example: in VGG, the number of output classes of the final fully-connected layer must be changed to match the target dataset (a concrete sketch of this appears at the end of this post).
To do that, we have to access the layers inside the loaded model.
In the previous post, when freezing layers, we accessed the parameters through model.named_parameters()
Here we need to access the layers themselves, not the parameters. Two methods do this: model.named_children()
and model.named_modules()
Let's look at the difference between the two.
import torch.nn as nn

class Flatten(nn.Module):
    # simple flatten layer so it can sit inside nn.Sequential
    def forward(self, x):
        return x.view(x.size(0), -1)

class test_model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=50,
                               kernel_size=3, stride=1, padding=1, bias=False)
        self.maxpool1 = nn.MaxPool2d(2, 2)
        self.bn1 = nn.BatchNorm2d(50)
        self.fc = nn.Sequential(Flatten(),
                                nn.Linear(50 * 7 * 7, 100),
                                nn.ReLU(),
                                nn.Linear(100, 100))

    def forward(self, inputs):
        feature_map1 = self.conv1(inputs)
        feature_map1 = self.maxpool1(feature_map1)
        feature_map1 = self.bn1(feature_map1)
        output = self.fc(feature_map1)  # was feature_map5: an undefined name
        return output

test_model = test_model()  # instantiate (shadows the class name, as in the original)
The first is model.named_children()

for children_name, children in test_model.named_children():
    print('module name : ', children_name)
    print('='*80)
    print(children)
module name :  conv1
================================================================================
Conv2d(3, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
module name :  maxpool1
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  bn1
================================================================================
BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
module name :  fc
================================================================================
Sequential(
  (0): Flatten()
  (1): Linear(in_features=2450, out_features=100, bias=True)
  (2): ReLU()
  (3): Linear(in_features=100, out_features=100, bias=True)
)
Looking at the output, the last child, fc, is four layers bundled into one. That is because those last four layers were wrapped in nn.Sequential
in advance when the model was defined.
In other words, named_children() only iterates over the model's immediate children; once layers have been grouped, it does not descend any further into them. (The inner layers are still reachable by indexing, as sketched below.)
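Even though named_children() stops at the Sequential, the grouped layers remain reachable through attribute access and indexing. A minimal sketch, using the test_model defined above:

# The Sequential itself is the child; its sub-layers are indexable.
fc_block = test_model.fc          # the whole nn.Sequential child
first_linear = test_model.fc[1]   # Linear(in_features=2450, out_features=100)
print(first_linear)

# Equivalent lookup through the named_children() mapping:
children = dict(test_model.named_children())
print(children['fc'][1] is first_linear)  # True: the same module object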
model.named_modules()
for module_name, module in test_model.named_modules():
    print('module name : ', module_name)
    print('='*80)
    print(module)
module name :  
================================================================================
test_model(
  (conv1): Conv2d(3, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (maxpool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (bn1): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc): Sequential(
    (0): Flatten()
    (1): Linear(in_features=2450, out_features=100, bias=True)
    (2): ReLU()
    (3): Linear(in_features=100, out_features=100, bias=True)
  )
)
module name :  conv1
================================================================================
Conv2d(3, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
module name :  maxpool1
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  bn1
================================================================================
BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
module name :  fc
================================================================================
Sequential(
  (0): Flatten()
  (1): Linear(in_features=2450, out_features=100, bias=True)
  (2): ReLU()
  (3): Linear(in_features=100, out_features=100, bias=True)
)
module name :  fc.0
================================================================================
Flatten()
module name :  fc.1
================================================================================
Linear(in_features=2450, out_features=100, bias=True)
module name :  fc.2
================================================================================
ReLU()
module name :  fc.3
================================================================================
Linear(in_features=100, out_features=100, bias=True)
Looking at this result, the
nn.Sequential
block still appears as a group, as before, but it is also broken down further, so the sub-layers (fc.0 through fc.3) are reachable as well. Models in torchvision.models are mostly built this way, so we will use model.named_modules()
from here on.
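Because named_modules() yields every module recursively, it is handy for type-based filtering, e.g. collecting every Conv2d in the model. A minimal sketch against the vgg16 loaded above:

import torch.nn as nn

# vgg16 is the pretrained model loaded earlier in this post.
conv_names = [name for name, module in vgg16.named_modules()
              if isinstance(module, nn.Conv2d)]
print(len(conv_names))   # 13 convolution layers in VGG16
print(conv_names[:3])    # ['features.0', 'features.2', 'features.5']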
In the same way, let's look at the layers of vgg16.
for module_name, module in vgg16.named_modules():
    print('module name : ', module_name)
    print('='*80)
    print(module)
module name :  
================================================================================
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
module name :  features
================================================================================
Sequential(
  (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): ReLU(inplace=True)
  (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (3): ReLU(inplace=True)
  (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): ReLU(inplace=True)
  (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (8): ReLU(inplace=True)
  (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (11): ReLU(inplace=True)
  (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (13): ReLU(inplace=True)
  (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (15): ReLU(inplace=True)
  (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (18): ReLU(inplace=True)
  (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (20): ReLU(inplace=True)
  (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (22): ReLU(inplace=True)
  (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (25): ReLU(inplace=True)
  (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (27): ReLU(inplace=True)
  (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (29): ReLU(inplace=True)
  (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
module name :  features.0
================================================================================
Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.1
================================================================================
ReLU(inplace=True)
module name :  features.2
================================================================================
Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.3
================================================================================
ReLU(inplace=True)
module name :  features.4
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  features.5
================================================================================
Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.6
================================================================================
ReLU(inplace=True)
module name :  features.7
================================================================================
Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.8
================================================================================
ReLU(inplace=True)
module name :  features.9
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  features.10
================================================================================
Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.11
================================================================================
ReLU(inplace=True)
module name :  features.12
================================================================================
Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.13
================================================================================
ReLU(inplace=True)
module name :  features.14
================================================================================
Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.15
================================================================================
ReLU(inplace=True)
module name :  features.16
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  features.17
================================================================================
Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.18
================================================================================
ReLU(inplace=True)
module name :  features.19
================================================================================
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.20
================================================================================
ReLU(inplace=True)
module name :  features.21
================================================================================
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.22
================================================================================
ReLU(inplace=True)
module name :  features.23
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  features.24
================================================================================
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.25
================================================================================
ReLU(inplace=True)
module name :  features.26
================================================================================
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.27
================================================================================
ReLU(inplace=True)
module name :  features.28
================================================================================
Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
module name :  features.29
================================================================================
ReLU(inplace=True)
module name :  features.30
================================================================================
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
module name :  avgpool
================================================================================
AdaptiveAvgPool2d(output_size=(7, 7))
module name :  classifier
================================================================================
Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace=True)
  (2): Dropout(p=0.5, inplace=False)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): ReLU(inplace=True)
  (5): Dropout(p=0.5, inplace=False)
  (6): Linear(in_features=4096, out_features=1000, bias=True)
)
module name :  classifier.0
================================================================================
Linear(in_features=25088, out_features=4096, bias=True)
module name :  classifier.1
================================================================================
ReLU(inplace=True)
module name :  classifier.2
================================================================================
Dropout(p=0.5, inplace=False)
module name :  classifier.3
================================================================================
Linear(in_features=4096, out_features=4096, bias=True)
module name :  classifier.4
================================================================================
ReLU(inplace=True)
module name :  classifier.5
================================================================================
Dropout(p=0.5, inplace=False)
module name :  classifier.6
================================================================================
Linear(in_features=4096, out_features=1000, bias=True)
In order, the output shows the overall structure, then each block wrapped in nn.Sequential, and then the individual layers inside each Sequential (features.0 through features.30, classifier.0 through classifier.6).
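With the layer names known, modifying the loaded model is straightforward. As promised above, here is a minimal sketch that swaps VGG16's final fully-connected layer for a new dataset; the class count of 10 is an arbitrary assumption for illustration:

import torch.nn as nn
import torchvision.models as models

num_classes = 10  # assumed: number of classes in the target dataset

vgg16 = models.vgg16(pretrained=True)

# 'classifier.6' in the named_modules() listing is vgg16.classifier[6]:
# Linear(in_features=4096, out_features=1000). Replace it so the output
# size matches the new dataset; the new layer is randomly initialized.
vgg16.classifier[6] = nn.Linear(in_features=4096, out_features=num_classes)
print(vgg16.classifier[6])  # Linear(in_features=4096, out_features=10, bias=True)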