Vision Transformer

1. GLiT: Neural Architecture Search for Global and Local Image Transformer

2. CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image

3. Visformer

4. Going Deeper with Image Transformers
