LLM Optimization

1.FlashAttention

2.Data Types

3.Gradient Accumulation

4.Gradient Checkpointing

5.ZeRO (Zero Redundancy Optimizer)

6.Calculating the Optimal Batch Size

7.Grouped-Query Attention

8.FlashAttention

9.FlashAttention

10.Dynamic Decoding (ALiBi)

11.PagedAttention vs FlashAttention

12.PagedAttention

13.vLLM

14.Self-Attention and the KV Cache

15.KV Cache Optimization

16.Multi-Query Attention (MQA) and Grouped-Query Attention (GQA)
