[AIFFEL] 22.Mar.28, GD_Class_Activation_Map_3

Deok Jong Moon · March 28, 2022

Going Deeper, course review, mini-project

Today's study list

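cv2.findContours(image, mode, approximation method)
- Finds all the contours in an image.
- The returned contours is a sequence of arrays, one per contour. Each array holds the 2-D (x, y) points that lie along that contour's outline.
- I looked up how it works out of curiosity and found it rests on more concepts, and more involved ones, than I expected (though they are easy once you know them).
- Official documentation
- Medium post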
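cv2.minAreaRect()
- "Finds a rotated rectangle of the minimum area enclosing the input 2D point set" (here the point set is a contour returned by cv2.findContours).
- It returns the rectangle that encloses those points as ((center x, center y), (width, height), rotation angle).
- Each call takes one point set and returns one rectangle, so an image with several contours gives one rectangle per contour rather than a single rectangle for the whole image.
- https://stackoverflow.com/questions/55587820/how-to-get-the-only-min-area-rectangle-on-a-multiple-contours-image-with-cv2-min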
How to display Grad-CAM and CAM as a 3-channel heatmap
- https://keras.io/examples/vision/grad_cam/
- This Keras tutorial includes code that renders the map as an RGB heatmap.

Mini-project

: Implement CAM and Grad-CAM as understood from the papers and the hands-on exercises, then compare the accuracy of the bounding boxes that each method produces for object localisation.
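The post does not say which accuracy measure is used. Assuming it is intersection over union (IoU), a common choice for comparing a predicted box against a ground-truth box, a minimal sketch looks like this. The (y_min, x_min, y_max, x_max) box layout is also an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (y_min, x_min, y_max, x_max)."""
    # Corners of the overlapping region, if any.
    y_min = max(box_a[0], box_b[0])
    x_min = max(box_a[1], box_b[1])
    y_max = min(box_a[2], box_b[2])
    x_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    intersection = max(0.0, y_max - y_min) * max(0.0, x_max - x_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7, about 0.1429
```

Computing this once for the CAM box and once for the Grad-CAM box against the same ground truth gives a like-for-like comparison.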

CAM implementation
- Building the CAM model
  - p 2. "just before the final output layer (softmax in the case of categorization), we perform global average pooling on the convolutional feature maps and use those as features for a fully-connected layer that produces the desired output (categorical or otherwise)."
- Computing the CAM
  - p 2. "projecting back the weights of the output layer on to the convolutional feature maps, a technique we call class activation mapping"
  - p 2. "Similarly, we compute a weighted sum of the feature maps of the last convolutional layer to obtain our class activation maps."
  - p 2. "Here we ignore the bias term: we explicitly set the input bias of the softmax to 0 as it has little to no impact on the classification performance."
  - p 4. "In general, for each of these networks we remove the fully-connected layers before the final output and replace them with GAP followed by a fully-connected softmax layer."
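The quoted steps can be sketched in NumPy as a weighted sum over the channels of the last convolutional feature maps. The array sizes and random values below are hypothetical stand-ins for a real model's activations and its bias-free output-layer weights.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width, channels, num_classes = 7, 7, 16, 5  # hypothetical sizes

# Stand-ins for the last conv layer's output and the dense layer's weights.
feature_maps = rng.random((height, width, channels))
dense_weights = rng.normal(size=(channels, num_classes))
class_index = 2

# CAM for one class: at every location, sum the channels weighted by that class's weights.
cam = feature_maps @ dense_weights[:, class_index]  # shape (height, width)

# Sanity check: GAP followed by a bias-free dense layer gives the class score,
# which equals the spatial mean of the CAM.
gap = feature_maps.mean(axis=(0, 1))  # shape (channels,)
class_score = gap @ dense_weights[:, class_index]
print(cam.shape, np.isclose(cam.mean(), class_score))
```

The check at the end is why the map is meaningful: the class score is literally the average of the map, so high-valued regions are the ones that pushed the score up.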

Global Average Pooling vs Global Max Pooling
- Why did the authors choose average pooling?
- p 3. "when doing the average of a map, the value can be maximized by finding all discriminative parts of an object as all low activations reduce the output of the particular map. On the other hand, for GMP, low scores for all image regions except the most discriminative one do not impact the score as you just perform a max."
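A toy example makes the quoted argument concrete. The two 4x4 maps below are made up: both have the same peak activation, but one fires on a single cell and the other on a 2x2 region.

```python
import numpy as np

# Activation on a single cell (only the most discriminative part).
small = np.zeros((4, 4))
small[0, 0] = 1.0

# Same peak value, but covering a larger part of the object.
large = np.zeros((4, 4))
large[:2, :2] = 1.0

# Global max pooling cannot tell the two maps apart.
gmp_small, gmp_large = small.max(), large.max()

# Global average pooling rewards the map that covers more of the object.
gap_small, gap_large = small.mean(), large.mean()

print(gmp_small, gmp_large)
print(gap_small, gap_large)
```

Under GMP both maps score the same, so training has no incentive to widen the activation; under GAP the wider map scores higher.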

Where I got stuck

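1) I tried to write the code while looking at the exercise code as little as possible, so even obvious utility functions took a very long time. For example, I assumed the image resize + normalise function only needed to take the image, but then realised that a tf.data.Dataset delivers each element as (image, label).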
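A minimal sketch of such a utility, assuming a 224x224 target size and scaling to [0, 1] (neither is stated in this post). The function name and the dummy dataset are made up; the point is only that the mapped function takes and returns both image and label.

```python
import tensorflow as tf

IMAGE_SIZE = (224, 224)  # assumed target size


def normalize_and_resize(image, label):
    """Resize to IMAGE_SIZE and scale pixel values to [0, 1]; pass the label through."""
    image = tf.image.resize(image, IMAGE_SIZE)  # returns float32
    image = image / 255.0
    return image, label


# Dummy (image, label) dataset standing in for the real one.
images = tf.zeros((2, 32, 48, 3), dtype=tf.uint8)
labels = tf.constant([0, 1], dtype=tf.int64)
dataset = tf.data.Dataset.from_tensor_slices((images, labels))

# Dataset.map unpacks each (image, label) tuple into the function's two arguments.
dataset = dataset.map(normalize_and_resize).batch(2)

batch_images, batch_labels = next(iter(dataset))
print(batch_images.shape, batch_labels.numpy())
```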

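2) Loading with tfds.load('stanford_dogs', split=['train', 'test'], with_info=True, as_supervised=True) returns only the image and the integer classification label, even though the original data also carries fields such as bbox. To keep bbox and the other fields, leave as_supervised unset (it defaults to False); each element then arrives as a feature dictionary.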
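With as_supervised left out, the mapped function receives one dictionary instead of a tuple. The sketch below assumes the stanford_dogs dictionary exposes 'image', 'label', and 'objects' → 'bbox' keys, with boxes normalised as [ymin, xmin, ymax, xmax]; check ds_info.features to confirm. The function name and the fake example are made up so the snippet runs without downloading the dataset.

```python
import tensorflow as tf


def extract_image_label_bbox(features):
    """Pull image, label, and bbox out of a feature dictionary (key names assumed)."""
    image = tf.image.resize(features["image"], (224, 224)) / 255.0
    # Normalised boxes stay valid after resizing, so they pass through unchanged.
    bbox = features["objects"]["bbox"]
    return image, features["label"], bbox


# Real usage would look roughly like this (left commented: it downloads the dataset):
# import tensorflow_datasets as tfds
# (ds_train, ds_test), ds_info = tfds.load(
#     'stanford_dogs', split=['train', 'test'], with_info=True)
# ds_train = ds_train.map(extract_image_label_bbox)

# Hand-built example mimicking the assumed structure.
fake_features = {
    "image": tf.zeros((100, 200, 3), dtype=tf.uint8),
    "label": tf.constant(3, dtype=tf.int64),
    "objects": {"bbox": tf.constant([[0.1, 0.2, 0.5, 0.6]], dtype=tf.float32)},
}
image, label, bbox = extract_image_label_bbox(fake_features)
print(image.shape, int(label), bbox.numpy())
```

Because the number of boxes can differ between images, plain `.batch()` may fail on the bbox field; that case is outside this sketch.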
