AlphaFold2
Noisy Student
BERT
Attention Is All You Need
RQ-VAE
GPT Understands, Too
Graphormer
ESM-2
attention
AnomalyCLIP
WinCLIP
Multiscale Vision Transformers
Jetson LLaVA
Track Anything: Segment Anything Meets Videos (TAM)
Qwen2-VL
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
LLaVA-Med