CNN 입/출력 데이터 및 파라미터 계산

GRoovAllstar · November 3, 2022
  • How to compute each layer's output feature map size from its input
    • Input height : H
    • Input width : W
    • Filter height : FH
    • Filter width : FW
    • Stride : S
    • Padding : P
  • Output size formulas
    $Output\ Height = \frac{(H+2P)-FH}{S}+1$
    $Output\ Width = \frac{(W+2P)-FW}{S}+1$
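As a sketch, the formulas above can be wrapped in a small helper (the function and parameter names are illustrative, not from the original post); integer division mirrors the floor that frameworks apply when the division is not exact:

```python
def conv_output_size(h, w, fh, fw, s_h, s_w, p_h, p_w):
    """Output feature-map (height, width) for a convolution with explicit padding."""
    out_h = (h + 2 * p_h - fh) // s_h + 1  # floor division, matching the formula
    out_w = (w + 2 * p_w - fw) // s_w + 1
    return out_h, out_w

print(conv_output_size(7, 7, 3, 3, 1, 1, 0, 0))  # → (5, 5)
```

The same helper reproduces every worked example below by plugging in the corresponding kernel, stride, and padding values.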


Input Shape: (7, 7, 3) Output Shape : (5, 5, 4) K : (3, 3) P : (0, 0) S : (1, 1) Filter : 4

$Output\ Height = \frac{(7+2*0)-3}{1}+1 = 5$
$Output\ Width = \frac{(7+2*0)-3}{1}+1 = 5$

(Source - https://miro.medium.com/max/1100/1*ubRrYAZJUlCcqg7WoKjLgQ.gif)
When the kernel width and height differ
Input Shape: (7, 9, 3) Output Shape : (3, 8, 2) K : (5, 2) P : (0, 0) S : (1, 1) Filter : 2

$Output\ Height = \frac{(7+2*0)-5}{1}+1 = 3$
$Output\ Width = \frac{(9+2*0)-2}{1}+1 = 8$

(Source - https://miro.medium.com/max/1100/1*EnIGiVTcIMQm9ujkOHPc5A.gif)
When the strides differ
Input Shape: (9, 9, 3) Output Shape : (7, 3, 2) K : (3, 3) P : (0, 0) S : (1, 3) Filter : 2

$Output\ Height = \frac{(9+2*0)-3}{1}+1 = 7$
$Output\ Width = \frac{(9+2*0)-3}{3}+1 = 3$

(Source - https://miro.medium.com/max/1100/1*o9-Rq3QUC8IzTMfAJIbhLA.gif)

Input Shape : (7, 7, 2) Output Shape : (7, 7, 1) K : (3, 3) P : (1, 1) S : (1, 1) Filter : 1

$Output\ Height = \frac{(7+2*1)-3}{1}+1 = 7$
$Output\ Width = \frac{(7+2*1)-3}{1}+1 = 7$

(Source - https://miro.medium.com/max/1100/1*EnIGiVTcIMQm9ujkOHPc5A.gif)

  • Input/output and parameter calculation for a CNN layer

    • Parameter count formula
      $Params = Input\ Channels * Kernel\ Width * Kernel\ Height * Output\ Channels + Bias$
    • Output Channels equals the Filters value; the bias term adds one parameter per filter.
  • Pooling layer calculation

    $Output\ Row\ Size = \frac{Input\ Row\ Size}{Pooling\ Size}$
    $Output\ Column\ Size = \frac{Input\ Column\ Size}{Pooling\ Size}$
  • BatchNormalization parameter count

    • The current implementation tracks four parameters per channel: gamma, beta, mean, and standard deviation.
    • https://github.com/keras-team/keras/issues/1523
      • gamma : scaling parameter, beta : shift parameter.
      • The mean and standard deviation are non-trainable params.
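Putting the three rules above together, here is a minimal sketch (helper names are my own, not from the post) that counts parameters for a convolution layer, shrinks spatial dimensions through pooling, and counts batch-normalization parameters:

```python
def conv_params(in_channels, kernel_w, kernel_h, filters):
    """Input Channel * Kernel Width * Kernel Height * Output Channel + Bias (one per filter)."""
    return in_channels * kernel_w * kernel_h * filters + filters

def pool_output(rows, cols, pool_size):
    """Pooling divides each spatial dimension by the pooling size."""
    return rows // pool_size, cols // pool_size

def bn_params(channels):
    """gamma, beta, mean, standard deviation: 4 values per channel."""
    return 4 * channels

print(conv_params(3, 3, 3, 64))  # → 1792
print(pool_output(32, 32, 2))    # → (16, 16)
print(bn_params(64))             # → 256
```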
  • Model

from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                                     Dropout, GlobalAveragePooling2D, Input, MaxPooling2D)
from tensorflow.keras.models import Model

IMAGE_SIZE = 32
input_tensor = Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3), name='input')

x = Conv2D(filters=64, kernel_size=(3, 3), padding='same', name='conv2d_1')(input_tensor)
x = BatchNormalization(name='bn_1')(x)
x = Activation('relu', name='activation_1')(x)
x = MaxPooling2D(pool_size=2)(x)

x = Conv2D(filters=128, kernel_size=3, padding='same', name='conv2d_2')(x)
x = BatchNormalization(name='bn_2')(x)
x = Activation('relu', name='activation_2')(x)
x = MaxPooling2D(pool_size=2)(x)

x = Conv2D(filters=256, kernel_size=3, strides=2, padding='same', name='conv2d_3')(x)
x = BatchNormalization(name='bn_3')(x)
x = Activation('relu', name='activation_3')(x)
x = MaxPooling2D(pool_size=2)(x)

x = Conv2D(filters=512, kernel_size=3, strides=2, padding='same', name='conv2d_4')(x)
x = BatchNormalization(name='bn_4')(x)
x = Activation('relu', name='activation_4')(x)

x = GlobalAveragePooling2D()(x)
x = Dropout(rate=0.5)(x)
x = Dense(50, activation='relu', name='fc')(x)
x = Dropout(rate=0.5)(x)
output = Dense(10, activation='softmax', name='output')(x)

model = Model(inputs=input_tensor, outputs=output)
  • Calculation results
| Layer | Input Shape | Output Shape | Parameters |
| --- | --- | --- | --- |
| Convolution Layer 1 | 32, 32, 3 | (32+2×1−3)/1+1 = 32, (32+2×1−3)/1+1 = 32, 64 | 3×3×3×64+64 = 1792 |
| Batch Normalization Layer 1 | 32, 32, 64 | 32, 32, 64 | 64×4 = 256 |
| MaxPooling Layer 1 | 32, 32, 64 | 32/2 = 16, 32/2 = 16, 64 | 0 |
| Convolution Layer 2 | 16, 16, 64 | (16+2×1−3)/1+1 = 16, (16+2×1−3)/1+1 = 16, 128 | 64×3×3×128+128 = 73856 |
| Batch Normalization Layer 2 | 16, 16, 128 | 16, 16, 128 | 128×4 = 512 |
| MaxPooling Layer 2 | 16, 16, 128 | 16/2 = 8, 16/2 = 8, 128 | 0 |
| Convolution Layer 3 | 8, 8, 128 | (8+2×1−3)/2+1 = 4, (8+2×1−3)/2+1 = 4, 256 | 128×3×3×256+256 = 295168 |
| Batch Normalization Layer 3 | 4, 4, 256 | 4, 4, 256 | 256×4 = 1024 |
| MaxPooling Layer 3 | 4, 4, 256 | 4/2 = 2, 4/2 = 2, 256 | 0 |
| Convolution Layer 4 | 2, 2, 256 | (2+2×1−3)/2+1 = 1, (2+2×1−3)/2+1 = 1, 512 | 256×3×3×512+512 = 1180160 |
| Batch Normalization Layer 4 | 1, 1, 512 | 1, 1, 512 | 512×4 = 2048 |
| GlobalAveragePooling Layer | 1, 1, 512 | None, 512 | 0 |
| Dropout Layer 1 | None, 512 | None, 512 | 0 |
| Fully Connected Layer 1 | None, 512 | None, 50 | 512×50+50 = 25650 |
| Dropout Layer 2 | None, 50 | None, 50 | 0 |
| Output | None, 50 | None, 10 | 50×10+10 = 510 |

ref. https://towardsdatascience.com/conv2d-to-finally-understand-what-happens-in-the-forward-pass-1bbaafb0b148
ref. http://taewan.kim/post/cnn/
