U_Week_2_Day_9

유영재 · August 12, 2021

Boostcamp AI_Tech


Class Notes

Lecture List

[DL Basic] Sequential Models - RNN

Sequential Model

Naive sequence model

$p(x_{t}|x_{t-1},x_{t-2},...)$

Autoregressive model

$x_{t-r}$ : Fix the past timespan to the last $r$ steps

$p(x_{t}|x_{t-1},...,x_{t-r})$
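As a rough illustration of fixing the past timespan, the sketch below turns a series into (last $r$ values, next value) pairs, which is the data an autoregressive model of order $r$ would be fit on. The function name `make_ar_windows` and the toy series are assumptions made for this example, not something from the lecture.

```python
import numpy as np

def make_ar_windows(series, r):
    """Split a 1-D series into inputs (the previous r values) and targets (the next value)."""
    series = np.asarray(series, dtype=float)
    # Row i holds x_i ... x_{i+r-1}; the matching target is x_{i+r}.
    X = np.stack([series[i:i + r] for i in range(len(series) - r)])
    y = series[r:]
    return X, y

# Toy series, illustrative only.
X, y = make_ar_windows([1, 2, 3, 4, 5], r=2)
```

Each row of `X` is the fixed-length past and the matching entry of `y` is the value to predict, so anything older than $r$ steps is simply dropped.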

Markov model(First-order autoregressive model)

$\prod_{t=1}^T p(x_{t}|x_{t-1})$ : Easy to express the joint distribution

$p(x_{1},...,x_{T}) = p(x_{T}|x_{T-1})p(x_{T-1}|x_{T-2})...p(x_{2}|x_{1})p(x_{1}) = \prod_{t=1}^T p(x_{t}|x_{t-1})$, where $p(x_{1}|x_{0})$ is read as $p(x_{1})$
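The factorization can be checked numerically with a small sketch. The two-state chain, its probabilities, and the name `markov_joint_prob` are all hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical two-state chain; the numbers are illustrative only.
initial = np.array([0.6, 0.4])        # p(x_1)
transition = np.array([[0.9, 0.1],    # p(x_t | x_{t-1} = 0)
                       [0.5, 0.5]])   # p(x_t | x_{t-1} = 1)

def markov_joint_prob(sequence, initial, transition):
    """Joint probability p(x_1, ..., x_T) under a first-order Markov model."""
    prob = initial[sequence[0]]
    # Each later factor depends only on the immediately preceding state.
    for prev, curr in zip(sequence[:-1], sequence[1:]):
        prob *= transition[prev, curr]
    return float(prob)

# 0.6 * 0.9 * 0.1 * 0.5 = 0.027
p = markov_joint_prob([0, 0, 1, 1], initial, transition)
```

Only one initial distribution and one transition table are needed, no matter how long the sequence is, which is what makes the joint distribution easy to express.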

Latent autoregressive model

$h_{t}$ : summary of the past

$\hat{x} = p(x_{t}|h_{t})$

$h_{t} = g(h_{t-1}, x_{t-1})$

Recurrent Neural Network

Unrolled over time, an RNN can be thought of as a fully connected layer with a very large number of inputs.
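A minimal NumPy sketch of that unrolled view, assuming a tanh activation and the common convention that $h_{t}$ is computed from $h_{t-1}$ and the current input $x_{t}$ (the latent model above is written with $x_{t-1}$). The sizes, weight names, and random inputs are made up for the example.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    h = np.zeros(W_hh.shape[0])
    hidden_states = []
    for x_t in inputs:
        # The same weights are reused at every time step.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        hidden_states.append(h)
    return np.stack(hidden_states)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 5, 7   # illustrative sizes
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

states = rnn_forward(rng.normal(size=(seq_len, input_size)), W_xh, W_hh, b_h)
```

The last row of `states` depends on every input in the sequence, which is the sense in which the unrolled network behaves like one layer with many inputs.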

Short-term dependencies

Cannot capture long-term dependencies well

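This is why ReLU is rarely used as the activation function of a vanilla RNN: the same weights are multiplied in at every time step, and with an unbounded activation the values can explode over a long sequence.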
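A toy calculation can make the long-term dependency problem concrete. The sketch assumes the simplest possible case, a scalar linear recurrence $h_{t} = w h_{t-1}$, so that the gradient of the last state with respect to the first is just $w$ multiplied by itself once per step. A real RNN also multiplies in the activation derivative at each step; the function name and the values of `w` are illustrative.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the scalar linear recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # one factor of w per time step
    return grad

shrinking = gradient_through_time(0.5, 50)  # |w| < 1: the gradient vanishes
growing = gradient_through_time(1.5, 50)    # |w| > 1: the gradient explodes
```

With a saturating activation such as sigmoid or tanh the per-step derivative is at most 1, which pushes toward the vanishing case. With ReLU the derivative is 1 wherever the unit is active, so nothing damps the growth when the weights are large.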

Long Short Term Memory

Forget Gate : Decide which information to throw away

Input Gate : Decide which information to store in the cell state

Update cell : update the cell state

Output Gate : Make output using the updated cell state
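The four steps above can be sketched as a single LSTM time step. This is a minimal NumPy version of the standard formulation, assuming one stacked weight matrix `W` applied to the concatenation of the previous hidden state and the current input. The layout of `W`, the sizes, and the random inputs are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W maps [h_prev, x_t] to the four stacked gate pre-activations."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0 * n:1 * n])      # forget gate: what to throw away
    i_t = sigmoid(z[1 * n:2 * n])      # input gate: what to store
    c_tilde = np.tanh(z[2 * n:3 * n])  # candidate cell state
    o_t = sigmoid(z[3 * n:4 * n])      # output gate
    c_t = f_t * c_prev + i_t * c_tilde  # update cell
    h_t = o_t * np.tanh(c_t)            # output from the updated cell state
    return h_t, c_t

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4   # illustrative sizes
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x_t, h, c, W, b)
```

The cell state `c` is only scaled by the forget gate and added to, which gives information a more direct path across many time steps than the repeated squashing in a vanilla RNN.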

Gated Recurrent Unit

Simpler architecture with two gates (reset gate and update gate)

No cell state, just hidden state
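For comparison, a GRU step can be sketched the same way. This follows one common convention for the update, $h_{t} = (1 - z_{t}) * h_{t-1} + z_{t} * \tilde{h}_{t}$ (some references swap the roles of $z_{t}$ and $1 - z_{t}$), and omits biases for brevity. The weight names and sizes are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step with an update gate z_t and a reset gate r_t (biases omitted)."""
    hx = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ hx)  # update gate
    r_t = sigmoid(W_r @ hx)  # reset gate
    # The candidate state sees the previous hidden state only through the reset gate.
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))
    # There is no separate cell state: the hidden state is updated directly.
    return (1.0 - z_t) * h_prev + z_t * h_tilde

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4   # illustrative sizes
shape = (hidden_size, hidden_size + input_size)
W_z, W_r, W_h = (rng.normal(scale=0.1, size=shape) for _ in range(3))

h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h = gru_step(x_t, h, W_z, W_r, W_h)
```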

[DL Basic] Sequential Models - Transformer

Transformer is the first sequence transduction model based entirely on attention

Self-Attention

The same input is encoded to a different value depending on the inputs around it
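This context dependence can be seen in a small sketch of single-head scaled dot-product self-attention. The projection matrices are random stand-ins for learned weights, and the dimensions and the two "contexts" are made up for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention over the rows (tokens) of X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d_model, d_k = 4, 3   # illustrative sizes
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

# The same token vector placed first in two different contexts.
token = rng.normal(size=d_model)
context_a = np.stack([token, rng.normal(size=d_model), rng.normal(size=d_model)])
context_b = np.stack([token, rng.normal(size=d_model), rng.normal(size=d_model)])

out_a, weights_a = self_attention(context_a, W_q, W_k, W_v)
out_b, weights_b = self_attention(context_b, W_q, W_k, W_v)
same_encoding = np.allclose(out_a[0], out_b[0])
```

The first row of each output belongs to the same input vector, yet the two rows differ because each is a weighted mix of the values of all tokens in its own context.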

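*** References ***

허민석's review video of 'Attention is all you need'

Video comparing Bahdanau attention and self-attention

WikiDocs, Introduction to Natural Language Processing Using Deep Learning (딥러닝을 이용한 자연어처리 입문)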

Assignments

LSTM
- Completed while following the lecture

Multi-headed Attention
- Completed while following the lecture

Peer Session Notes

Reviewed RNN gate logic and concepts

Why is $tanh$ used in the output gate equation $h_{t}=o_{t}*tanh(C_{t})$? -> question for our mentor

How to write equations on GitHub using StackEdit

How are Q, K, and V created, and how do they work?
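A short note on the last question, using the standard scaled dot-product formulation from 'Attention is all you need' rather than anything specific to the peer session: for an input matrix $X$ with one row per token, the three matrices are linear projections of the same input with learned weights $W^{Q}$, $W^{K}$, $W^{V}$, and $d_{k}$ is the key dimension.

$Q = XW^{Q}, K = XW^{K}, V = XW^{V}$

$Attention(Q,K,V) = softmax(\frac{QK^{T}}{\sqrt{d_{k}}})V$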

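Reflections

During the day, while I was attending 이고잉's git special lecture, everything still felt peaceful, but at the evening office hour explaining the optional assignment I realized I still have a long way to go... a very long way to go....

Imitation is a second creation. Just try it first. - Mentor 류영표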
