CNN-Convolution

Ko Hyejung · December 11, 2021

NAVER AI TECH - precourse

The most important operation in a CNN (Convolutional Neural Network) is convolution.

Before studying CNNs, we learn the definition of convolution, how the operation is computed, and what it does.

We then look at the basic CNN architecture, which consists of convolution layers, pooling layers that downsample the input, and fully connected layers that connect all nodes to produce the final result.

Convolution

RGB Image Convolution

Stack of Convolutions

Convolutional Neural Networks

A CNN consists of convolution layers, pooling layers, and fully connected layers.
Convolution and pooling layers: feature extraction

Feature extraction

Feature extraction consists of using the representations learned by a previous network to extract interesting features from new samples. These features are then run through a new classifier, which is trained from scratch.

convnets

from keras import layers
from keras import models

# Small convnet: three 3 x 3 convolutions with 2 x 2 max pooling between them
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
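To see what the convnet above does to the feature map, the sketch below traces output shapes and parameter counts by hand, without importing Keras. It assumes the Keras defaults (no padding, stride 1, one bias per filter), and the helper names are made up for this illustration.

```python
def conv2d_shape(h, w, kernel, filters):
    """Output (height, width, channels) of an unpadded, stride-1 convolution."""
    return h - kernel + 1, w - kernel + 1, filters

def max_pool_shape(h, w, c, pool=2):
    """Output shape of non-overlapping max pooling (floor division)."""
    return h // pool, w // pool, c

def conv2d_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter (assumed Keras default)."""
    return (kernel * kernel * in_channels + 1) * filters

shape = (28, 28, 1)
trace = []
# (filters, whether a 2 x 2 max pooling layer follows), as in the model above
for filters, pool_after in [(32, True), (64, True), (64, False)]:
    params = conv2d_params(3, shape[2], filters)
    shape = conv2d_shape(shape[0], shape[1], 3, filters)
    trace.append((shape, params))
    if pool_after:
        shape = max_pool_shape(*shape)

for out_shape, params in trace:
    print(out_shape, params)
# (26, 26, 32) 320
# (11, 11, 64) 18496
# (3, 3, 64) 36928
```

The spatial size shrinks while the channel count grows, which is the usual pattern in this kind of network.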

Fully connected layer: decision making (e.g., classification)

Convolution Arithmetic (of GoogLeNet)

Stride

3 × 3 convolution patches with 2 × 2 strides
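As a small illustration of striding, the sketch below lists where a 3 × 3 window can be placed when it moves two tiles at a time. The 5 × 5 input size is an assumption made for the example, and `window_origins` is a hypothetical helper, not a library function.

```python
def window_origins(size, kernel, stride):
    """Top-left offsets along one axis where the kernel fits without padding."""
    return list(range(0, size - kernel + 1, stride))

# Assumed 5 x 5 input, 3 x 3 window, stride 2 in both directions
rows = window_origins(5, kernel=3, stride=2)
cols = window_origins(5, kernel=3, stride=2)
patches = [(r, c) for r in rows for c in cols]
print(patches)  # [(0, 0), (0, 2), (2, 0), (2, 2)]
```

With stride 2 only four patches are extracted, so the output is 2 × 2 instead of the 3 × 3 that stride 1 would give.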

Padding

Padding consists of adding an appropriate number of rows and columns on each side of the input feature map so that a convolution window can be centered on every input tile.

For a 3 × 3 window, you add one column on the right, one column on the left, one row at the top, and one row at the bottom. For a 5 × 5 window, you add two rows and two columns on each side.

Padding a 5 × 5 input in order to be able to extract 25 3 × 3 patches
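The padding case in the caption can be checked with a few lines of plain Python. This is only a sketch: it assumes zero padding (the usual choice), and `zero_pad` is a hypothetical helper written for this note.

```python
def zero_pad(grid, pad):
    """Return a copy of grid with `pad` rows/columns of zeros on every side."""
    width = len(grid[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in grid:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded += [[0] * width for _ in range(pad)]
    return padded

# 5 x 5 input with values 1..25
grid = [[r * 5 + c + 1 for c in range(5)] for r in range(5)]
padded = zero_pad(grid, 1)

# Count the 3 x 3 windows that fit in the padded 7 x 7 grid
n_patches = (len(padded) - 3 + 1) * (len(padded[0]) - 3 + 1)
print(len(padded), len(padded[0]), n_patches)  # 7 7 25
```

One ring of zeros turns the 5 × 5 input into 7 × 7, so every one of the 25 original tiles can sit at the center of a window.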

Stride? Padding?

Convolution Arithmetic

Padding (1), Stride (1), 3 × 3 Kernel

What is the number of parameters of this model?
The answer is 3 x 3 x 128 x 64 = 73,728
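The arithmetic above can be reproduced with two small helper functions. This is a sketch under stated assumptions: 128 is read as the number of input channels and 64 as the number of output channels, biases are ignored as in the count above, and the 40 × 50 spatial size is a made-up value used only to show that padding 1 with stride 1 keeps the size unchanged.

```python
def conv_output_size(size, kernel, padding, stride):
    """Output length along one axis: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_weight_count(kernel_h, kernel_w, in_channels, out_channels):
    """Number of kernel weights, ignoring biases."""
    return kernel_h * kernel_w * in_channels * out_channels

# Hypothetical 40 x 50 spatial size; it does not affect the parameter count.
out_h = conv_output_size(40, kernel=3, padding=1, stride=1)
out_w = conv_output_size(50, kernel=3, padding=1, stride=1)
weights = conv_weight_count(3, 3, 128, 64)
print(out_h, out_w, weights)  # 40 50 73728
```

The parameter count depends only on the kernel size and the channel counts, not on how large the input image is.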

What is the number of parameters of this model?
11 x 11 x 3 x 48 x 2 ≈ 35k
5 x 5 x 48 x 128 x 2 ≈ 307k
3 x 3 x 128 x 2 x 192 x 2 ≈ 884k
3 x 3 x 192 x 192 x 2 ≈ 663k
3 x 3 x 192 x 128 x 2 ≈ 442k

dense layers
13 x 13 x 128 x 2 x 2048 x 2 ≈ 177M
2048 x 2 x 2048 x 2 ≈ 16M
2048 x 2 x 1000 ≈ 4M
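To compare the two groups of layers, the products listed above can be multiplied out and summed. The sketch below only re-does that arithmetic; the factor tuples are copied from the lines above, and the variable names are chosen for this note.

```python
from math import prod

# Factors copied from the convolution-layer lines above
conv_layers = [
    (11, 11, 3, 48, 2),
    (5, 5, 48, 128, 2),
    (3, 3, 128, 2, 192, 2),
    (3, 3, 192, 192, 2),
    (3, 3, 192, 128, 2),
]
# Factors copied from the dense-layer lines above
dense_layers = [
    (13, 13, 128, 2, 2048, 2),
    (2048, 2, 2048, 2),
    (2048, 2, 1000),
]

conv_total = sum(prod(factors) for factors in conv_layers)
dense_total = sum(prod(factors) for factors in dense_layers)
dense_share = dense_total / (conv_total + dense_total)
print(conv_total, dense_total, round(dense_share, 3))
# 2332704 198082560 0.988
```

Almost all of the parameters sit in the dense layers, because convolution kernels are shared across every spatial position while dense weights are not.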

1x1 Convolution

Dimension reduction
To reduce the number of parameters while increasing the depth
e.g., bottleneck architecture
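A quick count shows why a 1x1 convolution helps. The channel sizes below (128 in, 32 in the middle, 128 out) are illustrative assumptions rather than values from the text, and biases are ignored.

```python
def conv_weight_count(kernel, in_channels, out_channels):
    """Weights of a square kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels

# Direct 3 x 3 convolution, 128 -> 128 channels
direct = conv_weight_count(3, 128, 128)

# Bottleneck: 1 x 1 convolution down to 32 channels, then 3 x 3 back up to 128
bottleneck = conv_weight_count(1, 128, 32) + conv_weight_count(3, 32, 128)

print(direct, bottleneck)  # 147456 40960
```

The bottleneck version has two layers instead of one, so the network is deeper, yet it uses less than a third of the weights.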

How convolution works

Consider a 5 × 5 feature map (25 tiles total)
There are only 9 tiles around which you can center a 3 × 3 window, forming a 3 × 3 grid
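The 5 × 5 to 3 × 3 case can be written out directly. The sketch below is a plain-Python version of the sliding-window operation (no padding, stride 1, no kernel flip, which is how deep learning libraries usually define convolution); the function name and the identity kernel are chosen for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            row.append(total)
        output.append(row)
    return output

# 5 x 5 feature map with values 0..24
image = [[r * 5 + c for c in range(5)] for r in range(5)]
# Kernel that copies the tile under the window center
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[6, 7, 8], [11, 12, 13], [16, 17, 18]]
```

The output holds exactly the 9 inner tiles, which are the only positions where a 3 × 3 window can be centered.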

If you start with 28 × 28 inputs, they become 26 × 26 after the first convolution layer.

max pooling operation

model_no_max_pool = models.Sequential()
model_no_max_pool.add(layers.Conv2D(32, (3, 3), activation='relu',
                      input_shape=(28, 28, 1)))
model_no_max_pool.add(layers.Conv2D(64, (3, 3), activation='relu'))
model_no_max_pool.add(layers.Conv2D(64, (3, 3), activation='relu'))
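For comparison, here is a rough sketch of what max pooling does and what happens to the feature map size when it is left out, as in the model above. It assumes 2 × 2 pooling windows with stride 2 and unpadded 3 × 3 convolutions; `max_pool2d` is a hand-written helper, not the Keras layer.

```python
def max_pool2d(grid, pool=2):
    """Keep the largest value in each non-overlapping pool x pool block."""
    out_h, out_w = len(grid) // pool, len(grid[0]) // pool
    return [
        [
            max(grid[i * pool + a][j * pool + b]
                for a in range(pool) for b in range(pool))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

grid = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
pooled = max_pool2d(grid)
print(pooled)  # [[4, 5], [3, 7]]

# Without pooling, three unpadded 3 x 3 convolutions only trim 2 tiles each
size = 28
for _ in range(3):
    size -= 2
values_per_sample = size * size * 64
print(size, values_per_sample)  # 22 30976
```

Without pooling the last layer still outputs a 22 × 22 × 64 feature map, far larger than the 3 × 3 × 64 map a model with two pooling layers would produce from the same input.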