Liao, M., Pang, G., Huang, J., Hassner, T., & Bai, X. (2020). Mask TextSpotter v3: Segmentation proposal network for robust scene text spotting.
Tan, M., & Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks.
Kim, S., Kim, D., Cho, M., & Kwak, S. (2020). Proxy anchor loss for deep metric learning.
Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., ... & Guo, B. (2021). Swin Transformer: Hierarchical vision transformer using shifted windows.
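Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale.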
Cao, Y., Xu, J., Lin, S., Wei, F., & Hu, H. (2019). GCNet: Non-local networks meet squeeze-excitation networks and beyond.
Zhu, Z., Xu, M., Bai, S., Huang, T., & Bai, X. (2019). Asymmetric non-local neural networks for semantic segmentation.
Wang, X., Girshick, R., Gupta, A., & He, K. (2018). Non-local neural networks.
Lee, J., Hayashi, H., Ohyama, W., & Uchida, S. (2019). Page segmentation using a convolutional neural network with trainable co-occurrence features.
Ma, K., Shu, Z., Bai, X., Wang, J., & Samaras, D. (2018). DocUNet: Document image unwarping via a stacked U-Net.
Zhou, X., Wang, D., & Krähenbühl, P. (2019). Objects as points.
Law, H., & Deng, J. (2018). CornerNet: Detecting objects as paired keypoints.
Zhou, X., Zhuo, J., & Krähenbühl, P. (2019). Bottom-up object detection by grouping extreme and center points.
Zhang, H., et al. (2019). Self-attention generative adversarial networks.
Bai, Y., et al. (2018). SOD-MTGAN: Small object detection via multi-task generative adversarial network.
Vaswani, A., et al. (2017). Attention is all you need. arXiv preprint arXiv:1706.03762.
Rothe, S., Ebert, S., & Schütze, H. (2016). Ultradense word embeddings by orthogonal transformation.