https://velog.io/@hayaseleu/Transposed-Convolutional-Layer은-무엇인가
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose
As summarized at the links above, the formula for the output shape is

new_rows = (rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] + output_padding[0]

or, for the other axis,

new_cols = (cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] + output_padding[1]

However, if you set padding = "same", it becomes very simple: the layer pads so that
input size * stride = output size.
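A quick sanity check of both cases (a minimal sketch of my own; the channel sizes are arbitrary and not part of the model below):

import tensorflow as tf

x = tf.zeros((1, 7, 7, 256))

# padding='valid': new_rows = (7 - 1) * 2 + 5 = 17
valid = tf.keras.layers.Conv2DTranspose(128, kernel_size=5, strides=2, padding='valid')(x)
# padding='same': new_rows = rows * stride = 7 * 2 = 14
same = tf.keras.layers.Conv2DTranspose(128, kernel_size=5, strides=2, padding='same')(x)

print(valid.shape)  # (1, 17, 17, 128)
print(same.shape)   # (1, 14, 14, 128)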
import tensorflow as tf
from tensorflow.keras import layers

def make_generator_model():
    # Start
    model = tf.keras.Sequential()
    # First: Dense layer, 100-dim noise -> 7*7*256 units
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Second: Reshape layer, (12544,) -> (7, 7, 256)
    model.add(layers.Reshape((7, 7, 256)))
    # Third: Conv2DTranspose layer, (7, 7, 256) -> (14, 14, 128)
    model.add(layers.Conv2DTranspose(128, kernel_size=(5, 5), strides=(2, 2),
                                     padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Fourth: Conv2DTranspose layer, (14, 14, 128) -> (28, 28, 64)
    model.add(layers.Conv2DTranspose(64, kernel_size=(5, 5), strides=(2, 2),
                                     padding='same', use_bias=False))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    # Fifth: Conv2DTranspose layer, (28, 28, 64) -> (84, 84, 1)
    model.add(layers.Conv2DTranspose(1, kernel_size=(5, 5), strides=(3, 3),
                                     padding='same', use_bias=False, activation='tanh'))
    return model
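The summary below can be reproduced by building the model and printing it:

generator = make_generator_model()
generator.summary()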
Model: "sequential_15"
dense_15 (Dense) (None, 12544) 1254400
batch_normalization_45 (Batc (None, 12544) 50176
leaky_re_lu_45 (LeakyReLU) (None, 12544) 0
reshape_15 (Reshape) (None, 7, 7, 256) 0
conv2d_transpose_45 (Conv2DT (None, 14, 14, 128) 819200
batch_normalization_46 (Batc (None, 14, 14, 128) 512
leaky_re_lu_46 (LeakyReLU) (None, 14, 14, 128) 0
conv2d_transpose_46 (Conv2DT (None, 28, 28, 64) 204800
batch_normalization_47 (Batc (None, 28, 28, 64) 256
leaky_re_lu_47 (LeakyReLU) (None, 28, 28, 64) 0
conv2d_transpose_47 (Conv2DT (None, 84, 84, 1) 1600
=================================
Total params: 2,330,944
Trainable params: 2,305,472
Non-trainable params: 25,472
That is, for the first Conv2DTranspose layer here we would need (7 - 1) * 2 + 5 - 2p = 14.
Rearranging gives 2p = 3, so p = 1.5, which is confusing because p is supposed to be an integer.
The reason is that with padding = "same" the padding can be distributed unevenly between the two sides.
For example, like this:
If you like ascii art:

"VALID" = without padding:

   inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                  |________________|                dropped
                                 |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
In this example:
Input width = 13
Filter width = 6
Stride = 5
Notes:
"VALID" only ever drops the right-most columns (or bottom-most rows).
"SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
Conclusion: when padding = "same" is used, the output size is simply the product of the input size and the stride. The layer pads automatically so that input size * stride = output size, and because that padding may be split unevenly between the two sides, the formula quoted earlier may not hold exactly.
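One way to see the asymmetry concretely is to compare a 'same' transposed convolution against a cropped 'valid' one. This is a sketch under the assumption that TensorFlow splits the 2p = 3 difference from above as 1 on the top/left and 2 on the bottom/right:

import tensorflow as tf

x = tf.random.normal((1, 7, 7, 1))
kernel = tf.random.normal((5, 5, 1, 1))  # (height, width, out_channels, in_channels)

valid = tf.nn.conv2d_transpose(x, kernel, output_shape=(1, 17, 17, 1),
                               strides=2, padding='VALID')
same = tf.nn.conv2d_transpose(x, kernel, output_shape=(1, 14, 14, 1),
                              strides=2, padding='SAME')

# Cropping 1 row/column at the top/left and 2 at the bottom/right of the
# 'valid' result should recover the 'same' result (3 cells split as 1 + 2).
print(tf.reduce_max(tf.abs(valid[:, 1:-2, 1:-2, :] - same)).numpy())  # ~0.0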