ํ˜ผ๊ณต ML+DL #24

myeong · Nov 19, 2022

๐Ÿ“Œ IMDB ์˜ํ™” ๋ฆฌ๋ทฐ ๋ฐ์ดํ„ฐ ๋ถ„์„

  • ๊ฐ์„ฑ๋ถ„์„
  • ๊ธ์ •/๋ถ€์ • text data set ๊ฐ 25000๊ฐœ
  • ์ด์ง„๋ถ„๋ฅ˜

๐Ÿ“ Data ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

  • 500๊ฐœ ๋‹จ์–ด๋งŒ ์‚ฌ์šฉ
  • Text -> ์ •์ˆ˜
from tensorflow.keras.datasets import imdb

(train_input, train_target), (test_input, test_target) = imdb.load_data(
    num_words=500)
    
print(train_input.shape, test_input.shape)
print(train_input[0])
print(train_target[:20])

(25000,) (25000,)
[1, 14, 22, 16, 43, 2, 2, 2, 2, 65, 458, ...
-> '1' marks the start of a sample // '2' stands for any word outside the top 500
[1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 1]
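The integer coding can be sketched with a toy vocabulary. The ranks below are made up, but the convention is Keras's default for `imdb.load_data`: ids 0–2 are reserved, and a word of frequency rank r is stored as r + 3 (`index_from=3`).

```python
# Toy sketch of the imdb index convention (hypothetical word ranks):
# 0 = padding, 1 = start-of-sample, 2 = out-of-vocabulary, ranks offset by 3
word_rank = {'the': 1, 'and': 2, 'movie': 3}
id_to_word = {rank + 3: w for w, rank in word_rank.items()}
specials = {0: '<pad>', 1: '<start>', 2: '<oov>'}

sample = [1, 4, 2, 6]                          # start, 'the', oov, 'movie'
decoded = [specials.get(i) or id_to_word.get(i, '?') for i in sample]
print(decoded)  # ['<start>', 'the', '<oov>', 'movie']
```

The real mapping comes from `imdb.get_word_index()`, but the offset logic is the same.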

๐Ÿ“ Data Set

  • ์ƒ˜ํ”Œ์˜ ๊ธธ์ด๊ฐ€ ์ฒœ์ฐจ๋งŒ๋ณ„ (๋‹จ์–ด ๊ฐœ์ˆ˜)
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

train_input, val_input, train_target, val_target = train_test_split(
    train_input, train_target, test_size=0.2, random_state=42
)
lengths = np.array([len(x) for x in train_input])
print(np.mean(lengths), np.median(lengths))

plt.hist(lengths)
plt.xlabel('length')
plt.ylabel('frequency')
plt.show()

237.9088125 179.0 // mean > median: the length distribution is right-skewed
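Mean above median means most reviews are short with a long tail of long ones. A sketch with synthetic lengths (a stand-in for the real review lengths, not the actual data) shows one percentile-based way to pick `maxlen`:

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic right-skewed lengths standing in for the real review lengths
lengths = rng.lognormal(mean=5.0, sigma=0.6, size=20000).astype(int)

print(np.mean(lengths) > np.median(lengths))   # right skew: mean exceeds median
# One way to choose maxlen: cover most samples, e.g. the 75th percentile
print(int(np.percentile(lengths, 75)))
```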


๐Ÿ“ ์‹œํ€€์Šค ํŒจ๋”ฉ

  • ๋ฌธ์žฅ์˜ ๊ธธ์ด๋ฅผ ์ง€์ •ํ•ด์„œ ์ž๋ฅด๊ฑฐ๋‚˜ ๋นˆ ๊ณณ์€ 0์œผ๋กœ ํŒจ๋”ฉ
    (์ผ๋ฐ˜์ ์œผ๋กœ ๋ฌธ์žฅ์˜ ์•ž๋ถ€๋ถ„์„ ์ž๋ฅด๊ฑฐ๋‚˜ ํŒจ๋”ฉํ•จ)

  • ํ† ํฐ = ๋‹จ์–ด ๊ฐœ์ˆ˜ = ํƒ€์ž„์Šคํ… = 100

from tensorflow.keras.preprocessing.sequence import pad_sequences
train_seq = pad_sequences(train_input, maxlen=100)
val_seq = pad_sequences(val_input, maxlen=100)
print(train_seq.shape)

(20000, 100) // samples, tokens
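By default `pad_sequences` both pads and truncates at the front (`padding='pre'`, `truncating='pre'`), which is what the note above refers to; a minimal check with toy sequences:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

seqs = [[1, 2, 3], [1, 2, 3, 4, 5, 6]]
# Defaults: padding='pre', truncating='pre' -> zeros in front, front dropped
padded = pad_sequences(seqs, maxlen=4)
print(padded)
# [[0 1 2 3]
#  [3 4 5 6]]
```

Padding in front keeps the most recent tokens closest to the final timestep, which is where a simple RNN's memory is strongest.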



๐Ÿ“Œ ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ

๐Ÿ“ ๋ชจ๋ธ ์ƒ์„ฑ

from tensorflow import keras
model = keras.Sequential()
model.add(keras.layers.SimpleRNN(8, input_shape=(100, 500)))
model.add(keras.layers.Dense(1, activation='sigmoid'))
  • ์› ํ•ซ ์ธ์ฝ”๋”ฉ : 500๊ฐœ vector -> ์›์†Œ ํ•˜๋‚˜๋งŒ 1, ๋‚˜๋จธ์ง€ 0
  • 20000x100์ธ train_seq -> 20000x100x500
train_oh = keras.utils.to_categorical(train_seq)
val_oh = keras.utils.to_categorical(val_seq)

print(train_oh.shape)
print(train_oh[0][0][:12])
print(np.sum(train_oh[0][0]))

(20000, 100, 500)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
1.0 // exactly one element is 1
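A minimal sketch of what `to_categorical` does to an integer sequence, using toy values and `num_classes=5` instead of 500:

```python
import numpy as np
from tensorflow import keras

# to_categorical one-hot encodes every integer, appending an axis of
# size num_classes to the input shape
seq = np.array([[3, 0, 1]])                       # shape (1, 3)
oh = keras.utils.to_categorical(seq, num_classes=5)
print(oh.shape)   # (1, 3, 5)
print(oh[0, 0])   # [0. 0. 0. 1. 0.]
```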

  • SimpleRNN params
    500 one-hot input dimensions x 8 neurons, fully connected
    + 8 recurrent hidden-state values h x 8 neurons, fully connected
    + 8 biases
    = 4,072

  • Dense params
    8 neurons x 1 weight each
    + 1 bias
    = 9
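These counts follow the general formulas; a quick arithmetic check (the helper names are made up for illustration):

```python
def simple_rnn_params(input_dim, units):
    # input-to-hidden weights + hidden-to-hidden weights + biases
    return input_dim * units + units * units + units

def dense_params(inputs, units):
    # weights + biases
    return inputs * units + units

print(simple_rnn_params(500, 8))   # 4072
print(dense_params(8, 1))          # 9
```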

๐Ÿ“ ๋ชจ๋ธ ํ›ˆ๋ จ

  • RMSprop ์˜ตํ‹ฐ๋งˆ์ด์ € : learning_rate=1e-4
  • ์ด์ง„๋ถ„๋ฅ˜ ์†์‹คํ•จ์ˆ˜ : binary_crossentropy
  • ์ฝœ๋ฐฑํ•จ์ˆ˜ : Checkpoint, EarlyStopping
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model.compile(optimizer=rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-simplernn-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)

history = model.fit(train_oh, train_target, epochs=100, batch_size=64,
                    validation_data=(val_oh, val_target),
                    callbacks=[checkpoint_cb, early_stopping_cb])
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

-> ~80% accuracy
-> One-hot inputs waste more and more memory as the data grows

๐Ÿ“ ๋‹จ์–ด ์ž„๋ฒ ๋”ฉ ์‚ฌ์šฉ

  • ๋‹จ์–ด๋ฅผ ์‹ค์ˆ˜ ๋ฒกํ„ฐ๋กœ ์ž„๋ฒ ๋”ฉ
  • ๊ฑฐ๋ฆฌ ๊ณ„์‚ฐ์œผ๋กœ ๋‹จ์–ด ๋น„๊ต ๊ฐ€๋Šฅ
  • ์ž…๋ ฅ ์ฐจ์› 100 -> ์ถœ๋ ฅ ์ฐจ์› 16 (ํ† ํฐ ํ•˜๋‚˜ ๋‹น)
model2 = keras.Sequential()

model2.add(keras.layers.Embedding(500, 16, input_length=100))
model2.add(keras.layers.SimpleRNN(8))
model2.add(keras.layers.Dense(1, activation='sigmoid'))

model2.summary()
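A minimal shape check of the Embedding layer in isolation (toy token ids), showing that each integer is replaced by a trainable 16-dimensional vector:

```python
import numpy as np
from tensorflow import keras

emb = keras.layers.Embedding(500, 16)    # vocab 500 -> 16-dim vectors
tokens = np.array([[5, 7, 2]])           # one sequence of 3 token ids
out = emb(tokens)
print(out.shape)                         # (1, 3, 16)
```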

  • Embedding params
    each of the 500 vocabulary entries -> a 16-dimensional real vector
    500 x 16
    = 8,000

  • SimpleRNN params
    16 input dimensions x 8 neurons
    + 8 hidden-state values x 8 neurons
    + 8 biases
    = 200

-> ๋น„์Šทํ•œ ๊ฒฐ๊ณผ



๐Ÿ”— ํ˜ผ๊ณต MLDL-24

0๊ฐœ์˜ ๋Œ“๊ธ€