Producing ‘visual explanations’ for decisions from CNN-based models without decreasing the model’s performance
The ability to explain why a model predicts what it predicts builds trust in intelligent systems
The CAM work trades off model complexity and performance for more transparency into the workings of the model
Proposing Grad-CAM, which generates visual explanations for CNN-based networks without requiring architectural changes
Applying Grad-CAM to existing top-performing classification, captioning, and VQA models
Proposing a proof-of-concept showing how interpretable Grad-CAM visualizations help diagnose failure modes by uncovering biases in datasets
Presenting Grad-CAM visualizations for ResNets applied to image classification and VQA
Helping untrained users discern a ‘stronger’ network from a ‘weaker’ one even when both make identical predictions
Using the gradient information flowing into the last convolutional layer of the CNN
Assigning an importance value to each neuron for a particular decision of interest
Gradients flowing back are global-average-pooled to obtain the neuron importance weights α_k^c
The weight α_k^c represents a partial linearization of the deep network downstream of the activations A, and captures the ‘importance’ of feature map k for the target class c: α_k^c = (1/Z) Σ_i Σ_j ∂y^c/∂A^k_ij
The heatmap is a weighted linear combination of feature maps followed by a ReLU: L^c_Grad-CAM = ReLU(Σ_k α_k^c A^k)
The ReLU prevents the map from highlighting pixels belonging to classes other than the desired one, which would otherwise hurt localization performance
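The steps above (global-average-pool the gradients, weight the feature maps, apply ReLU) can be sketched in NumPy. This is a minimal illustration, not the paper’s implementation: it assumes the activations and gradients of the last conv layer have already been extracted (e.g. via framework hooks), and the name `grad_cam` is mine.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from one conv layer's outputs.

    activations: (K, H, W) feature maps A^k of the last conv layer
    gradients:   (K, H, W) gradients dy^c / dA^k for a target class c
    """
    # alpha_k^c: global-average-pool the gradients over spatial dims
    alpha = gradients.mean(axis=(1, 2))             # shape (K,)
    # weighted linear combination of the forward feature maps
    cam = np.tensordot(alpha, activations, axes=1)  # shape (H, W)
    # ReLU keeps only features with a positive influence on class c
    return np.maximum(cam, 0.0)
```

In practice the resulting (H, W) map is upsampled to the input image resolution before being overlaid as a heatmap.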
Grad-CAM generalizes CAM
F^k is the GAP output of feature map A^k: F^k = (1/Z) Σ_i Σ_j A^k_ij, and the CAM class score is Y^c = Σ_k w^c_k F^k
Rewriting via ∂Y^c/∂F^k = w^c_k and ∂F^k/∂A^k_ij = 1/Z gives w^c_k = Z · ∂Y^c/∂A^k_ij; summing over all pixels yields w^c_k = Σ_i Σ_j ∂Y^c/∂A^k_ij, which matches α^c_k up to the 1/Z factor
Z is the number of pixels in the feature map
Grad-CAM is a strict generalization of CAM
Allowing visual explanations to be generated from CNN-based models that cascade convolutional layers with more complex interactions (e.g. image captioning, VQA)
Obtaining class predictions from the network, generating Grad-CAM maps for each predicted class, and binarizing them
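A minimal sketch of the binarization step, assuming the common choice of thresholding at a fraction of the map’s maximum intensity and drawing a tight box around all above-threshold pixels (the `frac=0.15` default and the function name are my assumptions; a fuller version would box only the largest connected segment):

```python
import numpy as np

def cam_to_bbox(cam, frac=0.15):
    """Binarize a Grad-CAM map at frac * max intensity and return the
    tight bounding box (row_min, row_max, col_min, col_max) of the mask."""
    mask = cam >= frac * cam.max()
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return rows[0], rows[-1], cols[0], cols[-1]
```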
Grad-CAM localization with a pre-trained VGG-16: Grad-CAM achieves lower localization error than c-MWP on ILSVRC-15
Segmenting objects with image-level annotation
New loss function for training weakly-supervised image segmentation models was proposed
The algorithm is sensitive to the choice of weak localization seed
Previously, CAM maps were used as seeds; replacing them with Grad-CAM maps obtained from a standard VGG-16 network
achieves an Intersection over Union (IoU) score of 49.6 (vs. 44.6 obtained with CAM)
Obtaining category-specific visualizations by combining Grad-CAM with Deconvolution and Guided Backpropagation (Deconvolution Grad-CAM and Guided Grad-CAM)
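The fusion itself is a pointwise product of the fine-grained pixel-space map with the coarse, class-discriminative Grad-CAM map upsampled to image resolution. A dependency-free sketch, assuming the image size is an integer multiple of the feature-map size and using nearest-neighbor upsampling for simplicity (bilinear interpolation is the more typical choice); the function name is mine:

```python
import numpy as np

def guided_grad_cam(guided_backprop, cam):
    """Fuse a fine-grained Guided Backpropagation map (H, W) with a
    coarse Grad-CAM map (h, w) via upsampling and pointwise product."""
    H, W = guided_backprop.shape
    h, w = cam.shape
    # nearest-neighbor upsample of the coarse CAM to image resolution
    cam_up = np.kron(cam, np.ones((H // h, W // w)))
    # pixels outside the class-discriminative region are zeroed out
    return guided_backprop * cam_up
```

The product keeps the high-resolution detail of Guided Backpropagation only where Grad-CAM says the target class is located.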
Evaluating Trust
With Guided Backpropagation alone, VGG-16 received only a slightly higher trust score; with Guided Grad-CAM it received a clearly higher score, indicating that Guided Grad-CAM better helps users recognize VGG-16 as the more reliable network
This is a significant paper that serves as a cornerstone for visualizing AI detection results. I should experiment with various image datasets and look for a way to fine-tune this with the CheXNet model.