[Paper Review] Hybrid (Conv & Transformer) Architecture

1. CvT: Introducing Convolutions to Vision Transformers

2. [Simple Review] [DeiT] Training data-efficient image transformers & distillation through attention

3. LeViT: a Vision Transformer in ConvNet’s Clothing for Faster Inference

4. [Simple Review] CoAtNet: Marrying Convolution and Attention for All Data Sizes
