Paper

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
A T2V model paper from Picsart AI. It is said to support streaming.
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
A T2V/I2V model that generates video from 3D alignment. The code is well organized, so it looks like it could be adapted for other uses.
VidLA: Video-Language Alignment at Scale
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
Presents a fast and easy method for fine-tuning LLMs.
LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis
A paper from NVIDIA. It supposedly takes 1 second on an A6000...
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding
Understanding the context of videos.
Can large language models explore in-context?
A comparative analysis of several publicly available LLMs. Which model actually does well at extrapolation?

[Daily paper] 24-03-26