https://velog.io/@hayaseleu/Transposed-Convolutional-Layer은-무엇인가
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose
As summarized at the links above, the formula for the output shape is

new_rows = (rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]

or, for the other axis,

new_cols = (cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]

However, if you set padding = "same", it becomes very simple: the layer pads so that
input size * stride = output size.
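A quick sanity check of both cases (a minimal sketch of my own; the channel sizes are arbitrary and not part of the model below):

import tensorflow as tf

x = tf.zeros((1, 7, 7, 256))

# padding='valid': new_rows = (7 - 1) * 2 + 5 = 17
valid = tf.keras.layers.Conv2DTranspose(128, kernel_size=5, strides=2, padding='valid')(x)
# padding='same': new_rows = rows * stride = 7 * 2 = 14
same = tf.keras.layers.Conv2DTranspose(128, kernel_size=5, strides=2, padding='same')(x)

print(valid.shape)  # (1, 17, 17, 128)
print(same.shape)   # (1, 14, 14, 128)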
import tensorflow as tf
from tensorflow.keras import layers

def make_generator_model():
    # Start
    model = tf.keras.Sequential()
    # First: Dense layer, 100-dim noise -> 7*7*256 units
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Second: Reshape layer, (12544,) -> (7, 7, 256)
    model.add(layers.Reshape((7, 7, 256)))
    # Third: Conv2DTranspose layer, (7, 7, 256) -> (14, 14, 128)
    model.add(layers.Conv2DTranspose(128, kernel_size=(5, 5), strides=(2, 2),
                                     padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Fourth: Conv2DTranspose layer, (14, 14, 128) -> (28, 28, 64)
    model.add(layers.Conv2DTranspose(64, kernel_size=(5, 5), strides=(2, 2),
                                     padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Fifth: Conv2DTranspose layer, (28, 28, 64) -> (84, 84, 1)
    model.add(layers.Conv2DTranspose(1, kernel_size=(5, 5), strides=(3, 3),
                                     padding='same', use_bias=False, activation='tanh'))
    return model
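The summary below can be reproduced by building the model and printing it:

generator = make_generator_model()
generator.summary()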
Model: "sequential_15"
dense_15 (Dense) (None, 12544) 1254400
batch_normalization_45 (Batc (None, 12544) 50176
leaky_re_lu_45 (LeakyReLU) (None, 12544) 0
reshape_15 (Reshape) (None, 7, 7, 256) 0
conv2d_transpose_45 (Conv2DT (None, 14, 14, 128) 819200
batch_normalization_46 (Batc (None, 14, 14, 128) 512
leaky_re_lu_46 (LeakyReLU) (None, 14, 14, 128) 0
conv2d_transpose_46 (Conv2DT (None, 28, 28, 64) 204800
batch_normalization_47 (Batc (None, 28, 28, 64) 256
leaky_re_lu_47 (LeakyReLU) (None, 28, 28, 64) 0
conv2d_transpose_47 (Conv2DT (None, 84, 84, 1) 1600
=================================
Total params: 2,330,944
Trainable params: 2,305,472
Non-trainable params: 25,472
That is, for the first Conv2DTranspose layer here we would need (7 - 1) * 2 + 5 - 2p = 14.
Rearranging gives 2p = 3, so p = 1.5, which is confusing because p is supposed to be an integer.
The reason is that with padding = "same" the padding can be distributed unevenly between the two sides.
For example, like this:
If you like ascii art:

"VALID" = without padding:

   inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                  |________________|                dropped
                                 |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
In this example:
Input width = 13
Filter width = 6
Stride = 5
Notes:
"VALID" only ever drops the right-most columns (or bottom-most rows).
"SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Conclusion: when padding = "same" is used, the output size is simply the product of the input size and the stride. The layer pads automatically so that input size * stride = output size, and because that padding may be split unevenly between the two sides, the formula quoted earlier may not hold exactly.
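One way to see the asymmetry concretely is to compare a 'same' transposed convolution against a cropped 'valid' one. This is a sketch under the assumption that TensorFlow splits the 2p = 3 difference from above as 1 on the top/left and 2 on the bottom/right:

import tensorflow as tf

x = tf.random.normal((1, 7, 7, 1))
kernel = tf.random.normal((5, 5, 1, 1))  # (height, width, out_channels, in_channels)

valid = tf.nn.conv2d_transpose(x, kernel, output_shape=(1, 17, 17, 1),
                               strides=2, padding='VALID')
same = tf.nn.conv2d_transpose(x, kernel, output_shape=(1, 14, 14, 1),
                              strides=2, padding='SAME')

# Cropping 1 row/column at the top/left and 2 at the bottom/right of the
# 'valid' result should recover the 'same' result (3 cells split as 1 + 2).
print(tf.reduce_max(tf.abs(valid[:, 1:-2, 1:-2, :] - same)).numpy())  # ~0.0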