The Illustrated Word2vec
https://jalammar.github.io/illustrated-word2vec/
Word2vec (Distributed Representations of Words and Phrases and their Compositionality)
https://arxiv.org/abs/1310.4546
Sequence to Sequence Learning with Neural Networks
https://arxiv.org/abs/1409.3215
Effective Approaches to Attention-based Neural Machine Translation
https://arxiv.org/abs/1508.04025
Sparse is Enough in Scaling Transformers
https://openreview.net/pdf?id=-b5OSCydOMe
The Illustrated Transformer
https://jalammar.github.io/illustrated-transformer/
Attention Is All You Need
https://arxiv.org/abs/1706.03762
The Annotated Transformer
https://nlp.seas.harvard.edu/2018/04/03/attention.html
Group Normalization
https://openaccess.thecvf.com/content_ECCV_2018/papers/Yuxin_Wu_Group_Normalization_ECCV_2018_paper.pdf
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
https://arxiv.org/abs/1810.04805
XLNet: Generalized Autoregressive Pretraining for Language Understanding
https://arxiv.org/abs/1906.08237
GPT-1: Improving Language Understanding by Generative Pre-Training
https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
GPT-2: Language Models are Unsupervised Multitask Learners
https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
GPT-3: Language Models are Few-Shot Learners
https://arxiv.org/abs/2005.14165