🚩 Deep Learning Basics - part22. Understanding RNNs (feat. Why Use a Recurrent Neural Network?)

vinca · December 15, 2022

🌓 AI/DL - theory


Introduction

Let's look at why RNNs, that is, recurrent neural networks, are used, and build an understanding of them.

What is a Recurrent Neural Network (RNN)?

  • A network structure that remembers previous data and feeds it in alongside the next data, so that earlier inputs can influence later outputs.
  • Mainly used for sequence data (such as sentences) and continuous time-series data.

Why use an RNN?

What..? How is it different from other models?

Consider a typical CNN model. Each convolution layer learns its own kernels, but what the network computes for one input is not carried over to the next input: every sample is processed independently, with no memory of what came before.

A fully connected (FC) model is the same. The weights between nodes capture features of the current input only, and no state is passed along from one input to the next.

Characteristics of recurrent neural networks

Advantages

  • In theory, an RNN is a simple model, and it can process sequential data of any length.

  • It can also take previous information into account.
    Take the Korean sentence "철수가 숟가락을 들고 밥을 펐다." ("Cheolsu picked up a spoon and scooped rice."). The preceding words 철수가 (Cheolsu) + 숟가락을 (spoon) + 들고 (picked up) + 밥을 (rice) can be used to predict the final word 펐다 (scooped).

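Both advantages can be seen in a few lines of NumPy. This is only a sketch under assumptions: the helper `run_rnn`, the weight names `W_x`, `W_h`, `b`, and the random (untrained) weights are all made up for illustration, and the update rule is the standard simple-RNN form with tanh.

```python
import numpy as np

def run_rnn(sequence, W_x, W_h, b):
    """Apply the same weights at every step and return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in sequence:
        # The new state depends on the current input AND the previous state
        h = np.tanh(x_t @ W_x + h @ W_h + b)
    return h

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 2))   # input-to-hidden weights (illustrative)
W_h = rng.normal(size=(2, 2))   # hidden-to-hidden weights (illustrative)
b = np.zeros(2)

short_seq = rng.normal(size=(3, 4))    # 3 time steps
long_seq = rng.normal(size=(50, 4))    # 50 time steps

h_short = run_rnn(short_seq, W_x, W_h, b)
h_long = run_rnn(long_seq, W_x, W_h, b)
print(h_short.shape, h_long.shape)  # (2,) (2,)
```

The same three weight arrays handle 3 steps or 50 steps, and the result is always a hidden state of the same size that summarizes everything seen so far.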
Disadvantages

  • However, handling context that depends on a state far from the current position, such as the relationship between 철수가 (Cheolsu) at the start of a sentence and 펐다 (scooped) at its end, is relatively difficult.
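One commonly cited reason for this difficulty is that the influence of an early state on a late state shrinks as it passes through many steps. The sketch below tracks that influence as a Jacobian; it assumes a tanh simple RNN, random weights, and a recurrent matrix rescaled to spectral norm 0.5 purely to make the effect easy to see. None of these values come from a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 4
W_h = rng.normal(size=(hidden_size, hidden_size))
W_h *= 0.5 / np.linalg.norm(W_h, 2)   # assumption: rescale to spectral norm 0.5
W_x = rng.normal(size=(3, hidden_size))

h = np.zeros(hidden_size)
jacobian = np.eye(hidden_size)        # d h_t / d h_0, accumulated step by step
norms = []
for t in range(20):
    x_t = rng.normal(size=3)
    h = np.tanh(x_t @ W_x + h @ W_h)
    # One-step chain rule: d h_t[j] / d h_{t-1}[i] = W_h[i, j] * (1 - h_t[j]^2)
    step_jacobian = W_h * (1.0 - h ** 2)
    jacobian = jacobian @ step_jacobian
    norms.append(np.linalg.norm(jacobian, 2))

print(norms[0], norms[-1])   # influence after 1 step vs. after 20 steps
```

Under these assumptions the norm after 20 steps is bounded by 0.5^20, so a word 20 steps back barely affects the current state.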

Structure of a recurrent neural network


The previous state and the current input are fed in together at every step.
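Written as code, one step looks roughly like this. It is a minimal NumPy sketch assuming the standard simple-RNN update h_t = tanh(x_t·W_x + h_prev·W_h + b); the names `rnn_step`, `W_x`, `W_h`, and `b` are illustrative, and the weights are random rather than trained.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One simple-RNN step: combine the current input with the previous state."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

rng = np.random.default_rng(0)
input_dim, hidden_size = 4, 2
W_x = rng.normal(size=(input_dim, hidden_size))    # input-to-hidden weights
W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden_size)

x_t = np.array([1.0, 0.0, 0.0, 0.0])   # current input (one-hot)
h_prev = np.zeros(hidden_size)          # previous state (zeros at the first step)
h_t = rnn_step(x_t, h_prev, W_x, W_h, b)
print(h_t.shape)  # (2,)
```

The returned `h_t` becomes `h_prev` for the next step, which is where the "recurrent" in the name comes from.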

How a recurrent neural network operates



SimpleRNN example

import numpy as np
from tensorflow.keras.layers import RNN, SimpleRNNCell

# One-hot encoding for each char in 'hello'
h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]

# Input shape: (batch_size=1, sequence_length=1, input_dim=4)
x_data = np.array([[h]], dtype=np.float32)
hidden_size = 2  # must be defined before the layer is built

# Equivalent shortcut layer:
# tensorflow.keras.layers.SimpleRNN(units=hidden_size, return_sequences=True, return_state=True)
cell = SimpleRNNCell(units=hidden_size)
rnn = RNN(cell, return_sequences=True, return_state=True)

# outputs: hidden state at every step; states: the final hidden state
outputs, states = rnn(x_data)

print("Number of parameters:", rnn.count_params())
print("weights:", rnn.weights)
print('x_data: {}, shape: {}'.format(x_data, x_data.shape))
print('outputs: {}, shape: {}'.format(outputs, outputs.shape))
print('states: {}, shape: {}'.format(states, states.shape))

Result

Number of parameters: 14
weights:
[<tf.Variable 'rnn_3/simple_rnn_cell_3/kernel:0' shape=(4, 2) dtype=float32, numpy=
array([[-0.8238611 , 0.26491618],
[ 0.8516624 , -0.05171776],
[ 0.6305232 , -0.4280374 ],
[ 0.20209384, -0.00895429]], dtype=float32)>,
<tf.Variable 'rnn_3/simple_rnn_cell_3/recurrent_kernel:0' shape=(2, 2) dtype=float32, numpy=
array([[-0.8955449 , -0.44497135],
[ 0.44497135, -0.8955448 ]], dtype=float32)>,
<tf.Variable 'rnn_3/simple_rnn_cell_3/bias:0' shape=(2,) dtype=float32, numpy=
array([0., 0.], dtype=float32)>]
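The parameter count of 14 can be checked by hand from the three weight shapes in the result, and the same weights can be used to recompute the layer's output for the input 'h'. This sketch assumes the Keras default tanh activation for SimpleRNNCell and an all-zero initial state.

```python
import numpy as np

input_dim, hidden_size = 4, 2

# kernel (input_dim x hidden) + recurrent_kernel (hidden x hidden) + bias (hidden)
num_params = input_dim * hidden_size + hidden_size * hidden_size + hidden_size
print(num_params)  # 14

# Weights copied from the printed result above
kernel = np.array([[-0.8238611, 0.26491618],
                   [0.8516624, -0.05171776],
                   [0.6305232, -0.4280374],
                   [0.20209384, -0.00895429]], dtype=np.float32)
recurrent_kernel = np.array([[-0.8955449, -0.44497135],
                             [0.44497135, -0.8955448]], dtype=np.float32)
bias = np.zeros(2, dtype=np.float32)

x = np.array([1, 0, 0, 0], dtype=np.float32)       # one-hot vector for 'h'
h_prev = np.zeros(hidden_size, dtype=np.float32)   # initial state (zeros)

# Assumes the default tanh activation
output = np.tanh(x @ kernel + h_prev @ recurrent_kernel + bias)
print(output)
```

Because the input is one-hot and the initial state is zero, the output is simply tanh applied to the first row of `kernel`.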

Types of recurrent neural networks

1.Many-To-One

Multiple inputs, single output.
Used when reading a sentence and grasping its meaning, e.g. sentiment analysis.

From the multiple input words 밥은, 먹고, 다니니 (roughly "Are you eating properly these days?"), the model infers a single meaning: "asking after someone's well-being".
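A many-to-one setup can be sketched as: run over every input, then use only the final hidden state for one prediction. Everything here is assumed for illustration: random weights, three random word vectors, and two hypothetical output classes.

```python
import numpy as np

rng = np.random.default_rng(0)
W_x, W_h = rng.normal(size=(4, 3)), rng.normal(size=(3, 3))
W_out = rng.normal(size=(3, 2))      # hidden state -> 2 hypothetical classes

sequence = rng.normal(size=(3, 4))   # 3 input steps, e.g. three word vectors
h = np.zeros(3)
for x_t in sequence:
    h = np.tanh(x_t @ W_x + h @ W_h)

logits = h @ W_out                   # only the final state is used
probs = np.exp(logits - logits.max())
probs /= probs.sum()                 # softmax over the classes
print(probs.shape)  # (2,)
```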

2.One-To-Many

Single input, multiple outputs.
Used for image captioning.

ํ•˜๋‚˜์˜ ์ด๋ฏธ์ง€๋ฅผ ํ†ตํ•ด ํผ๊ทธ, ์ด๋ถˆ, ์š”๋‹ค ๋ผ๋Š” ์บก์…˜์„ ๅคšํ•˜๊ฒŒ ์–ป์„ ์ˆ˜ ์žˆ๋‹ค.

3.Many-To-Many 1 : per-input analysis

Morphological analysis, where every input token gets its own output, falls into this category.
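In this aligned many-to-many case, an output is read off at every step instead of only at the end. A minimal sketch with random weights and five hypothetical tags; in Keras this roughly corresponds to setting `return_sequences=True` and applying an output layer to each step.

```python
import numpy as np

rng = np.random.default_rng(0)
W_x, W_h = rng.normal(size=(4, 3)), rng.normal(size=(3, 3))
W_out = rng.normal(size=(3, 5))      # hidden state -> 5 hypothetical tags

sequence = rng.normal(size=(6, 4))   # 6 input tokens
h = np.zeros(3)
tags = []
for x_t in sequence:
    h = np.tanh(x_t @ W_x + h @ W_h)
    tags.append(int(np.argmax(h @ W_out)))   # one tag per input step
print(len(tags))  # 6
```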

4.Many-To-Many 2 : whole-sequence analysis

Translation falls into this category. The whole input is read first, and only then is the output produced.
Because sentences are composed differently in each language, the number of inputs and the number of outputs can differ.
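This "read everything, then write" pattern is often built as an encoder followed by a decoder. The sketch below assumes that design, with random weights, a hypothetical vocabulary of 6 tokens, and made-up start and end token ids; it only shows that the output length is decided by the decoder, not by the input length.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, vocab_size = 3, 6
W_x = rng.normal(size=(4, hidden_size))
W_enc = rng.normal(size=(hidden_size, hidden_size))
W_dec = rng.normal(size=(hidden_size, hidden_size))
W_emb = rng.normal(size=(vocab_size, hidden_size))
W_out = rng.normal(size=(hidden_size, vocab_size))

# Encoder: read the whole input sequence first
source = rng.normal(size=(3, 4))     # 3 input tokens
h = np.zeros(hidden_size)
for x_t in source:
    h = np.tanh(x_t @ W_x + h @ W_enc)

# Decoder: start from the encoder's final state and emit tokens
END, MAX_LEN = 0, 5                  # hypothetical end-token id and length cap
token, target = 1, []                # 1 is a hypothetical start-token id
for _ in range(MAX_LEN):
    h = np.tanh(W_emb[token] + h @ W_dec)
    token = int(np.argmax(h @ W_out))
    if token == END:                 # the decoder decides when to stop
        break
    target.append(token)
print(len(source), len(target))
```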

What is Stacking?

Using several layers so that more complex problems can be solved.
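In a stacked RNN, the second layer takes the first layer's hidden state at every step as its input sequence. A NumPy sketch with an illustrative helper `rnn_layer` and random weights; in Keras the lower layer would need `return_sequences=True` so that the upper layer receives a full sequence.

```python
import numpy as np

def rnn_layer(inputs, W_x, W_h):
    """Run one simple-RNN layer and return the hidden state at every step."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(x_t @ W_x + h @ W_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
sequence = rng.normal(size=(5, 4))   # 5 steps, 4 features

# Layer 1 reads the raw inputs; layer 2 reads layer 1's hidden states.
layer1 = rnn_layer(sequence, rng.normal(size=(4, 8)), rng.normal(size=(8, 8)))
layer2 = rnn_layer(layer1, rng.normal(size=(8, 3)), rng.normal(size=(3, 3)))
print(layer1.shape, layer2.shape)  # (5, 8) (5, 3)
```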

What is Bidirectional?

For sequential inputs, learning not only the relationship with the preceding data but also the relationship with the data that follows.
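One way to picture this: run one RNN from left to right, run a second RNN from right to left, and join the two states at each step. The sketch below does that with an illustrative helper `rnn_layer` and random weights, and joins the states by concatenation, which is one possible choice. Keras offers a `Bidirectional` wrapper for the same idea.

```python
import numpy as np

def rnn_layer(inputs, W_x, W_h):
    """Run one simple-RNN layer and return the hidden state at every step."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x_t in inputs:
        h = np.tanh(x_t @ W_x + h @ W_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
sequence = rng.normal(size=(5, 4))   # 5 steps, 4 features

forward = rnn_layer(sequence, rng.normal(size=(4, 3)), rng.normal(size=(3, 3)))
# Process the reversed sequence, then flip the states back into the original order.
backward = rnn_layer(sequence[::-1], rng.normal(size=(4, 3)), rng.normal(size=(3, 3)))[::-1]

# Each step now carries information from the past (forward) and the future (backward).
combined = np.concatenate([forward, backward], axis=1)
print(combined.shape)  # (5, 6)
```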
