LeNet Implementation (tf) and Comparison with MLP

Henry Lee · August 17, 2021
  • 👨🏻‍💻 Colab-based (both CPU and GPU runtime types)

LeNet Implementation & Comparison with MLP and ConvNet

Load Dataset: MNIST

import tensorflow as tf

data = tf.keras.datasets.mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
(train_X, train_y), (test_X, test_y) = data
print(train_X.shape)
print(train_y.shape)
print(test_X.shape)
print(test_y.shape)
(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)
# Add a channel dimension so Conv2D can consume the images: (N, 28, 28) -> (N, 28, 28, 1)
train_X = train_X.reshape((60000, 28, 28, 1))
test_X = test_X.reshape((10000, 28, 28, 1))
print(train_X.shape)
print(train_y.shape)
print(test_X.shape)
print(test_y.shape)

(60000, 28, 28, 1)
(60000,)
(10000, 28, 28, 1)
(10000,)
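
Note that the pixel values stay as raw integers in [0, 255]; no normalization is applied, which likely contributes to the high first-epoch MLP loss seen later. A minimal preprocessing sketch (an assumption on my part, not part of the original runs) that scales pixels to [0, 1]:

# Hedged sketch: scale raw pixels to [0, 1]; not applied in the runs below.
train_X = train_X.astype('float32') / 255.0
test_X = test_X.astype('float32') / 255.0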

Modeling

MLP

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout
MLP = Sequential([
  Flatten(input_shape=(28, 28)),
  Dense(128, activation='relu'),
  Dropout(0.2),
  Dense(10, activation='softmax')
])
MLP.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
MLP.summary()

Model: "sequential"

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
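
As a sanity check on the summary, a Dense layer has inputs × units + units parameters: 784 × 128 + 128 = 100,480 for the hidden layer and 128 × 10 + 10 = 1,290 for the output layer, matching the 101,770 total.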

LeNet

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, AveragePooling2D, Flatten, Dense, ZeroPadding2D
LeNet = Sequential([
                    InputLayer(input_shape=(28,28,1)),
                    ZeroPadding2D((2,2)),
                    Conv2D(6, 5, activation='sigmoid'),
                    AveragePooling2D(strides=2),
                    Conv2D(16, 5, activation='sigmoid'),
                    AveragePooling2D(strides=2),
                    Flatten(),
                    Dense(120, activation='sigmoid'),
                    Dense(84, activation='sigmoid'),
                    Dense(10, activation='softmax')
])
LeNet.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
LeNet.summary()

Model: "sequential_1"

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
zero_padding2d (ZeroPadding2 (None, 32, 32, 1)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 28, 28, 6)         156       
_________________________________________________________________
average_pooling2d (AveragePo (None, 14, 14, 6)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 10, 10, 16)        2416      
_________________________________________________________________
average_pooling2d_1 (Average (None, 5, 5, 16)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 400)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 120)               48120     
_________________________________________________________________
dense_3 (Dense)              (None, 84)                10164     
_________________________________________________________________
dense_4 (Dense)              (None, 10)                850       
=================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0
_________________________________________________________________
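
The ZeroPadding2D layer pads the 28 × 28 MNIST images to the 32 × 32 input that LeNet-5 was designed for. The parameter counts again check out: a Conv2D layer has (kernel_h × kernel_w × in_channels) × filters + filters parameters, so (5 × 5 × 1) × 6 + 6 = 156 and (5 × 5 × 6) × 16 + 16 = 2,416; the Dense layers add 400 × 120 + 120 = 48,120, 120 × 84 + 84 = 10,164, and 84 × 10 + 10 = 850, for 61,706 in total.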

ConvNet

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense
ConvNet = Sequential([
                      Conv2D(32, 3, activation='relu', input_shape=(28,28,1)),
                      Flatten(),
                      Dense(128, activation='relu'),
                      Dense(10, activation='softmax')
])
ConvNet.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
ConvNet.summary()

Model: "sequential_2"

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_2 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
flatten_2 (Flatten)          (None, 21632)             0         
_________________________________________________________________
dense_5 (Dense)              (None, 128)               2769024   
_________________________________________________________________
dense_6 (Dense)              (None, 10)                1290      
=================================================================
Total params: 2,770,634
Trainable params: 2,770,634
Non-trainable params: 0
_________________________________________________________________
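
Note where the parameters come from: with no pooling, Flatten emits 26 × 26 × 32 = 21,632 features, so the first Dense layer alone costs 21,632 × 128 + 128 = 2,769,024 parameters. That is why ConvNet, though shallower, has roughly 45× the parameters of LeNet.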

Training & Evaluation

MLP

MLP.fit(train_X, train_y)
MLP.evaluate(test_X, test_y)

1875/1875 [==============================] - 6s 2ms/step - loss: 2.2490 - accuracy: 0.7611
313/313 [==============================] - 1s 2ms/step - loss: 0.5504 - accuracy: 0.8636

[0.5504131317138672, 0.8636000156402588]

LeNet

LeNet.fit(train_X, train_y)
LeNet.evaluate(test_X, test_y)

1875/1875 [==============================] - 33s 2ms/step - loss: 0.5518 - accuracy: 0.8357
313/313 [==============================] - 1s 2ms/step - loss: 0.2008 - accuracy: 0.9393

[0.20078687369823456, 0.939300000667572]

ConvNet

ConvNet.fit(train_X, train_y)
ConvNet.evaluate(test_X, test_y)

1875/1875 [==============================] - 5s 3ms/step - loss: 0.8377 - accuracy: 0.9289
313/313 [==============================] - 1s 2ms/step - loss: 0.1167 - accuracy: 0.9682

[0.11674761027097702, 0.9682000279426575]
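
All three models above are trained for a single epoch, the fit default. A minimal sketch of a longer run with a held-out validation split; epochs=5 and validation_split=0.1 are illustrative assumptions, not the settings that produced the numbers above:

# Hedged sketch: train longer and monitor a 10% validation split.
history = LeNet.fit(train_X, train_y, epochs=5, validation_split=0.1)
LeNet.evaluate(test_X, test_y)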

Conclusion

Model   | #Params   | Acc    | Remark
--------|-----------|--------|---------------------
MLP     | 101,770   | 0.8636 | Dropout = 0.2
LeNet   | 61,706    | 0.9393 | Activation = sigmoid
ConvNet | 2,770,634 | 0.9682 | Activation = ReLU

Takeaways

  • Three factors for improving performance

    • Wider: number of filters
      • Advances in GPUs have made wider networks practical. In this example, ConvNet is the widest.
      • ConvNet (#filters = 32): about 60 s per epoch on the CPU runtime -> about 5 s on the GPU runtime (Colab)
    • Deeper: number of layers
      • In this example, LeNet is the deepest.
    • Bigger resolution: input image size
      • Not covered in this example.
  • EfficientNet is the result of studying how these three factors affect performance and mixing them appropriately; a sketch follows below!
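
For reference, tf.keras ships EfficientNet implementations (TensorFlow ≥ 2.3). A minimal sketch instantiating the B0 baseline with random weights; the input shape and class count here are illustrative assumptions:

from tensorflow.keras.applications import EfficientNetB0

# B0 is the compound-scaling baseline; B1-B7 jointly scale depth, width,
# and resolution. weights=None skips downloading the ImageNet weights.
model = EfficientNetB0(weights=None, input_shape=(224, 224, 3), classes=10)
model.summary()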
