VLM(Vision-Language Model) to LMM(Large Multimodal Model)

Hyungseop Lee·2025년 12월 15일

[Paper Review] VLM to LMM

목록 보기

1/10

VLM에서 LMM으로 변해가는 context를 대략적으로 정리해본다

1. image-text alignment 중심 (understanding)

2. understanding + captioning(generating)

현재 여기 까지 읽었음 (25.12.15)

3. ?

Efficient Deep Learning

다음 포스트

[2021 CVPR] (Simple Review) VirTex: Learning Visual Representations from Textual Annotations

0개의 댓글