Towards Interpretable Object Detection by Unfolding Latent Structures
Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention
Building Explainable AI Evaluation for Autonomous Perception
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Attention is not not Explanation
Stand-Alone Self-Attention in Vision Models
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
On the Relationship between Self-Attention and Convolutional Layers
Note: one of the papers that discusses the concept of the self-attention map.
Image Transformer. In: ICML (2018)
Attention Augmented Convolutional Networks. In: ICCV (2019)
Note: extract only the concepts related to positional encodings from this one.
XAI
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition
Explainable Vision Transformer Based COVID-19 Screening Using Radiograph
Attention
TransPose: Keypoint Localization via Transformer
Transformer Interpretability Beyond Attention Visualization
Positional Encodings
Conditional Positional Encodings for Vision Transformers
Adversarial
On the Adversarial Robustness of Visual Transformers