Series

2nd semester 2022 NLP study

1. Attention Is All You Need (2017)

I. Introduction: RNNs, LSTMs, and gated RNNs are SOTA models in language modeling and machine translation; they factor computation along the positions of the input and outp

July 21, 2022

2. Efficient Estimation of Word Representations in Vector Space (2013) a.k.a. Word2Vec

Many NLP approaches at the time treated each word in the vocabulary as the most basic (atomic) unit - simplicity, robustness, and a simple model with lots of data outperforms a complex model with little data -

August 17, 2022

3. Deep contextualized word representations (2018) a.k.a. ELMo

Abstract: a new type of deep contextualized word representation that models complex characteristics of word use (syntax, semantics) and models how word uses vary

August 31, 2022

4. Character-Aware Neural Language Models (2015) a.k.a. Character CNN

(1) word-based embedding (word2vec, GloVe) (2) character-based embedding (FastText) (3) character-based w/ word-based embedding (Character-Aware Neural L

September 15, 2022

5. Bidirectional LSTM-CRF Models for Sequence Tagging (NAACL 2016)

The first work to apply bi-LSTM + Conditional Random Field (CRF) to sequence tagging. The bi-LSTM makes use of both past and future information, and the CRF makes use of tag information from the whole sentence. bi-LS

September 29, 2022

6. Bi-Directional Attention Flow for Machine Comprehension (2017) a.k.a. BiDAF

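0. Abstract: Machine Comprehension (MC) is the same as machine QA: it requires complex interaction between the context and the query. Until recently (2017), attention has also been widely extended to MC, focusing on a small portion of the context and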

November 7, 2022

7. Get to the Point: Summarization with Pointer-Generator Networks (ACL, 2017) a.k.a. Pointer-Generator

Abstract: Shortcomings of existing abstractive summarization: models reproduce factual details inaccurately and tend to repeat themselves. New model: seq2seq attentional model hybrid pointer-gener

November 9, 2022

8. REALM: Retrieval-Augmented Language Model Pre-Training (ICML, 2020)

It is true that pretrained LMs perform very well on NLP tasks by capturing "world knowledge" well. However, this "world knowledge" is stored implicitly inside the model, as the parameters of the NN, and more

November 24, 2022