Attention Is All You Need, NeurIPS 2017
Goal: solving sequence modeling and transduction problems well
Constraint of sequential computation in recurrent models
Scaled dot-product QKV multi-head attention
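
A minimal NumPy sketch (not the authors' implementation) of a single attention head; the paper's formula is Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, and multi-head attention runs h such heads in parallel on learned projections of Q, K, V and concatenates the results. Shapes and names below are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q, k: (seq_len, d_k); v: (seq_len, d_v)
    d_k = q.shape[-1]
    # Dot-product scores, scaled by sqrt(d_k) to keep the softmax in a
    # well-behaved range for large d_k (Section 3.2.1 of the paper).
    scores = q @ k.T / np.sqrt(d_k)
    # Row-wise softmax over key positions.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted sum of the value vectors.
    return weights @ v

# Illustrative example: 4 positions, d_k = d_v = 8.
q = np.random.randn(4, 8)
k = np.random.randn(4, 8)
v = np.random.randn(4, 8)
out = scaled_dot_product_attention(q, k, v)  # shape (4, 8)
```
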
Positional Encoding
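
The sinusoidal encoding defined in Section 3.5 of the paper, added to the input embeddings so the model can use token order (pos is the position, i the dimension index, d_{model} the embedding size):

PE_{(pos, 2i)}   = \sin(pos / 10000^{2i/d_{model}})
PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d_{model}})
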
Evaluation on translation tasks: BLEU
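
For reference, the standard BLEU definition (Papineni et al., 2002), not restated in the paper itself: a brevity penalty BP times the geometric mean of modified n-gram precisions p_n with weights w_n (typically w_n = 1/N, N = 4), where c is the candidate length and r the effective reference length:

\mathrm{BLEU} = \mathrm{BP} \cdot \exp\left( \sum_{n=1}^{N} w_n \log p_n \right),
\quad \mathrm{BP} = \begin{cases} 1 & \text{if } c > r \\ e^{1 - r/c} & \text{if } c \le r \end{cases}
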
Comparison of training cost


Source:
Vaswani, Ashish, et al. "Attention Is All You Need." Advances in Neural Information Processing Systems 30, 2017.
https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf