What is "Attention"? Bahdanau and Luong
Paper review: "Attention Is All You Need"
Llama 2: model analysis and code review.
About Llama Factory...
About LoRA... Thanks to "lightning.ai"
Llama 3, The First Review! Introduction.
Llama 3, The Second Review! Pre-Training
DeepSeek-R1 Review.
Understanding MLA (Multi-Head Latent Attention)
Understanding DeepSeek MoE.