Modern CNN

yst3147 · February 13, 2022

Boostcamp AI tech DL Basic


Topics covered

AlexNet
VGGNet
GoogLeNet
ResNet
DenseNet
Summary

AlexNet

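Won the ILSVRC 2012 classification task by a wide margin (error rates below human level came only with later, deeper models)
Consists of 5 convolutional layers and 3 dense layers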

ILSVRC

ImageNet Large-Scale Visual Recognition Challenge
- Covers several tasks: classification, detection, localization, segmentation
- 1,000 different categories
- Over one million images
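- Training set of about 456,000 images (this figure matches the detection task; the classification training set has about 1.28 million images)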

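Key ideas

Rectified Linear Unit (ReLU) activation
-> helps mitigate the vanishing gradient problem
GPU implementation (trained across 2 GPUs)
Local response normalization (rarely used today)
-> normalizes each activation by the activity of neighboring channels, damping responses where several are large at once
Overlapping pooling
Data augmentation
Dropout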
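
Overlapping pooling means the pooling window is larger than the stride, so neighboring windows share inputs. The following is a minimal sketch of the output-size arithmetic, assuming AlexNet's 3 $\times$ 3 window with stride 2 applied to a 55 $\times$ 55 feature map and no padding; the helper name `pool_output_size` is made up for illustration.

```python
def pool_output_size(size: int, kernel: int, stride: int) -> int:
    """Spatial size after pooling with no padding."""
    return (size - kernel) // stride + 1

# Overlapping pooling: the kernel (3) is larger than the stride (2).
overlapping = pool_output_size(55, kernel=3, stride=2)
# Non-overlapping pooling for comparison: the kernel equals the stride.
non_overlapping = pool_output_size(55, kernel=2, stride=2)

print(overlapping, non_overlapping)
```

Both settings downsample by roughly the same factor; what changes is that adjacent 3 $\times$ 3 windows share one row or column of inputs.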

ReLU Activation

Preserves properties of linear models -> the slope is constant for positive inputs
Easy to optimize with gradient descent
Generalizes well
Mitigates the vanishing gradient problem
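
To see why a constant slope helps, here is a rough sketch that multiplies the activation derivative across a stack of layers. It is a simplification: weights are ignored, every layer is assumed to see the same positive pre-activation, and the depth of 20 is an arbitrary value chosen for illustration.

```python
import math

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; never larger than 0.25."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

depth = 20  # hypothetical number of stacked layers
x = 1.0     # assumed pre-activation, identical at every layer

# The chain rule multiplies one activation derivative per layer.
sigmoid_chain = sigmoid_grad(x) ** depth
relu_chain = relu_grad(x) ** depth

print(f"sigmoid: {sigmoid_chain:.3e}, relu: {relu_chain:.1f}")
```

The sigmoid product shrinks toward zero as depth grows, while the ReLU product stays at 1 on the positive side.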

VGGNet

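Increases depth by stacking 3 $\times$ 3 convolution filters (stride 1)
1 $\times$ 1 convolutions take the place of fully connected layers
-> an equivalent reformulation; unlike in GoogLeNet, this is not what reduces the parameter count
Uses dropout
Variants include VGG16 and VGG19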

Why 3 $\times$ 3 convolutions

Fewer parameters.
- Parameter count: one 5 $\times$ 5 convolution > two stacked 3 $\times$ 3 convolutions
  -> the receptive field (5 $\times$ 5) and the output size are the same
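
As a worked check, assume every layer has $C$ input channels and $C$ output channels and ignore biases (the symbol $C$ is introduced here only for illustration):

$$
\underbrace{2 \times (3 \times 3 \times C \times C)}_{\text{two } 3 \times 3 \text{ layers}} = 18C^2
\quad < \quad
\underbrace{5 \times 5 \times C \times C}_{\text{one } 5 \times 5 \text{ layer}} = 25C^2
$$

Under these assumptions the stacked version needs 28% fewer weights for the same receptive field.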

GoogLeNet

Consists of 22 layers

Won ILSVRC in 2014
- Makes good use of network-in-network (NIN) and inception blocks
  - network-in-network: a similar-looking sub-network is repeated many times
Has the fewest parameters when compared with AlexNet and VGGNet.
- AlexNet (8 layers): about 60M
- VGGNet (19 layers): about 144M
- GoogLeNet (22 layers): about 4M to 7M, depending on the source and how it is counted

Inception blocks

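A single input passes through several filters with different receptive fields in parallel, and the results are concatenated back into one output

1 $\times$ 1 convolutions reduce the number of parameters
- 1 $\times$ 1 convolution -> reduces the dimension along the channel axis
- In the accompanying example, the parameter count drops to roughly 30% of the original.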
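
The figure behind that example is not part of this text, so the sketch below assumes one set of channel sizes that gives a ratio of about 30%: a 3 $\times$ 3 convolution from 128 to 128 channels, compared with a 1 $\times$ 1 reduction to 32 channels followed by the same 3 $\times$ 3 convolution. The sizes and the helper `conv_params` are illustrative, and biases are ignored.

```python
def conv_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights in a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * in_ch * out_ch

# Assumed sizes for illustration: 128 channels in and out, squeezed to 32.
in_ch, mid_ch, out_ch = 128, 32, 128

# Plain 3x3 convolution applied directly to the 128-channel input.
direct = conv_params(in_ch, out_ch, kernel=3)
# 1x1 convolution reduces channels first, then the 3x3 convolution runs.
reduced = conv_params(in_ch, mid_ch, kernel=1) + conv_params(mid_ch, out_ch, kernel=3)

print(direct, reduced, round(reduced / direct, 3))
```

The spatial size and output channel count are unchanged; only the cost of the 3 $\times$ 3 convolution falls, because it now sees 32 input channels instead of 128.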

ResNet

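Deeper neural networks are hard to train -> performance actually gets worse.
- Overfitting is usually caused by too many parameters, but that is not the issue here: the deeper network also has higher training error, so the difficulty lies in optimization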

Skip connection

ResNet addresses this problem with an identity map (skip connection).
- The layer's input is added to its output. ( $f(x)$ -> $x + f(x)$ )

Thanks to skip connections, error decreases as layers are added
-> this is what made it possible to stack much deeper networks
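
The $x + f(x)$ rule can be sketched without any deep learning library. The code below is a toy illustration on plain Python lists, not ResNet's actual layers; `residual_block`, `zero_layer` and `scale_layer` are hypothetical names.

```python
from typing import Callable, List

Vector = List[float]

def residual_block(f: Callable[[Vector], Vector], x: Vector) -> Vector:
    """Return x + f(x), element-wise."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("f(x) must match the shape of x for an identity shortcut")
    return [xi + fi for xi, fi in zip(x, fx)]

def zero_layer(x: Vector) -> Vector:
    """Stand-in for a layer that has learned nothing."""
    return [0.0 for _ in x]

def scale_layer(x: Vector) -> Vector:
    """Stand-in for a layer that learns a small correction."""
    return [0.1 * xi for xi in x]

x = [1.0, -2.0, 3.0]
print(residual_block(zero_layer, x))   # the block behaves as the identity
print(residual_block(scale_layer, x))  # the input plus a small residual
```

Because a block whose $f$ outputs zero passes its input through unchanged, adding more blocks does not have to make the network worse than its shallower version.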

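When the input and output channel counts differ, a 1 $\times$ 1 convolution is used to match them
-> projected shortcut (not used often; the plain identity shortcut is the usual choice)
Batch normalization is applied after each convolution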

Bottleneck architecture

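A 1 $\times$ 1 convolution first reduces the number of input channels
-> this cuts the parameter count of the 3 $\times$ 3 convolution that follows
After the 3 $\times$ 3 convolution, another 1 $\times$ 1 convolution restores the channel count

This reduces the number of parameters while improving performance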
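
As a worked check, take the 256 -> 64 -> 64 -> 256 channel sizes used for the bottleneck block in the ResNet paper and compare against a single 3 $\times$ 3 convolution on 256 channels. Biases and batch normalization parameters are ignored, and the plain 3 $\times$ 3 layer is chosen here only as a baseline for comparison.

$$
\begin{aligned}
\text{bottleneck} &= (1 \times 1 \times 256 \times 64) + (3 \times 3 \times 64 \times 64) + (1 \times 1 \times 64 \times 256) \\
&= 16{,}384 + 36{,}864 + 16{,}384 = 69{,}632 \\
\text{plain } 3 \times 3 &= 3 \times 3 \times 256 \times 256 = 589{,}824
\end{aligned}
$$

Under these assumptions the three-layer bottleneck uses about 12% of the weights of the single wide 3 $\times$ 3 layer.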

DenseNet

Uses concatenation instead of addition in the skip connection

The outputs of successive layers accumulate
-> the number of channels keeps growing
This is handled by reducing the channel count between blocks

Block types

Dense Block
- Each layer concatenates the feature maps of all preceding layers.
- The number of channels increases progressively.
Transition Block
- BatchNorm -> 1 $\times$ 1 conv -> 2 $\times$ 2 AvgPooling
- Reduces the number of channels (using the 1 $\times$ 1 conv).
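
The channel bookkeeping of a dense block followed by a transition block can be sketched as below. The numbers are assumptions chosen for illustration (64 input channels, 6 layers, 32 new channels per layer, and a transition that halves the channel count), and both function names are hypothetical.

```python
from typing import List

def dense_block_channels(in_ch: int, num_layers: int, growth_rate: int) -> List[int]:
    """Channel count entering each layer, followed by the block's output."""
    channels = [in_ch]
    for _ in range(num_layers):
        # Each layer's new feature maps are concatenated onto everything before it.
        channels.append(channels[-1] + growth_rate)
    return channels

def transition_channels(in_ch: int, compression: float) -> int:
    """Channels left after the transition block's 1x1 convolution."""
    return int(in_ch * compression)

# Assumed values for illustration only.
trace = dense_block_channels(in_ch=64, num_layers=6, growth_rate=32)
after_transition = transition_channels(trace[-1], compression=0.5)

print(trace, after_transition)
```

The trace grows linearly inside the dense block, and the transition block resets it before the next dense block begins.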

Summary

Key element of each model
- VGG: repeated 3 $\times$ 3 conv blocks
- GoogLeNet: 1 $\times$ 1 convolutions
- ResNet: skip connections
- DenseNet: concatenation
