Action recognition papers

FSA · February 23, 2024

action recognition in videos


1. InternVideo: General Video Foundation Models via Generative and Discriminative Learning

https://arxiv.org/pdf/2212.03191v2.pdf
2022, 118 citations

2. Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

https://arxiv.org/pdf/2212.03229v1.pdf
2023, 23 citations

3. Unmasked Teacher: Towards Training-Efficient Video Foundation Models

https://arxiv.org/pdf/2303.16058v1.pdf
2023, 35 citations

4. Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning

https://arxiv.org/pdf/2212.03229v1.pdf
2023, 27 citations

5. UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

2022, 50 citations
https://openreview.net/pdf?id=d77RVuVg-Mf

6. Masked Feature Prediction for Self-Supervised Visual Pre-Training

2022, 480 citations
https://arxiv.org/pdf/2112.09133v2.pdf
https://velog.io/@hsbc/Masked-Feature-Prediction-for-Self-Supervised-Visual-Pre-Training

7. Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning

2023, 25 citations
https://arxiv.org/pdf/2212.04500v2.pdf

8. Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles

2023, 15 citations
https://arxiv.org/pdf/2306.00989v1.pdf

9. CoCa: Contrastive Captioners are Image-Text Foundation Models

2022, 761 citations
https://arxiv.org/pdf/2205.01917.pdf
https://velog.io/@hsbc/CoCa-Contrastive-Captioners-are-Image-Text-Foundation-Models

10. Multiview Transformers for Video Recognition

11. MERLOT: Multimodal Neural Script Knowledge Models

2021, 291 citations
https://proceedings.neurips.cc/paper_files/paper/2021/file/c6d4eb15f1e84a36eff58eca3627c82e-Paper.pdf
