[AIFFEL] 22.Apr.21, AIFFELTHON

Deok Jong Moon · April 21, 2022

Today's study list

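  • The term "cardinality" is used in the context of CNNs.
    • It came up in the CSPNet paper, so I looked up what it means ("In ResNeXt [39], Xie et al. first demonstrate that cardinality can be more effective than the dimensions of width and depth.")
    • My first impression was that it has to do with running computation in parallel on the GPU. In ResNeXt, cardinality is the number of parallel transformation paths inside a block, which is usually implemented with grouped convolutions.
    • I followed the link and started reading, but it looked like it would only go deeper, so I'm passing for now...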
  • torch.Tensor.contiguous()
  • torch.nn.ModuleList(modules=None)
    • Holds submodules in a list.
  • Anchor-free object detectors
    • A term I've been running into now and then in the papers I've read lately. It seems to be something of a trend(?), so I looked it up.
  • SPPNet
    • Problem statement
      • p 1. "CNNs require a fixed input image size (e.g., 224×224), which limits both the aspect ratio and the scale of the input image"
      • p 1. "In fact, convolutional layers do not require a fixed image size and can generate feature maps of any sizes. On the other hand, the fully-connected layers need to have fixed-size/length input by their definition. Hence, the fixed-size constraint comes only from the fully-connected layers, which exist at a deeper stage of the network."
    • Proposed solution
      • p 1. "Specifically, we add an SPP layer on top of the last convolutional layer. The SPP layer pools the features and generates fixed-length outputs, which are then fed into the fully-connected layers (or other classifiers)"
    • SPP layer
      • p 3. "Spatial pyramid pooling [14], [15] improves BoW in that it can maintain spatial information by pooling in local spatial bins. These spatial bins have sizes proportional to the image size, so the number of bins is fixed regardless of the image size"
      • p 3. "To adopt the deep network for images of arbitrary sizes, we replace the last pooling layer (e.g., pool5, after the last convolutional layer) with a spatial pyramid pooling layer. Figure 3 illustrates our method. In each spatial bin, we pool the responses of each filter (throughout this paper we use max pooling). The outputs of the spatial pyramid pooling are kM-dimensional vectors with the number of bins denoted as M (k is the number of filters in the last convolutional layer). The fixed-dimensional vectors are the input to the fully-connected layer."
      • So how does a fixed-length vector come out?
      • p 4. "For the network to accept 180×180 inputs, we implement another fixed-size-input (180×180) network. The feature map size after conv5 is a×a = 10×10 in this case. Then we still use win = ⌈a/n⌉ and str = ⌊a/n⌋ to implement each pyramid pooling level. The output of the spatial pyramid pooling layer of this 180-network has the same fixed length as the 224-network"
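
To convince myself of the win = ⌈a/n⌉, str = ⌊a/n⌋ rule quoted above, here is a minimal pure-Python sketch for a single channel. Some things in it are my own assumptions rather than anything stated in these notes: the 3-level pyramid (3×3, 2×2, 1×1), the 13×13 conv5 map for the 224 input, and the helper names `spp_level`, `max_pool_2d`, `spp_vector`, and `toy_map`, which are made up for illustration. It is not how the paper's authors implemented it, just the arithmetic spelled out.

```python
import math

def spp_level(a, n):
    """Window, stride, and bins per side for one pyramid level on an a x a map."""
    win = math.ceil(a / n)                    # win = ceil(a / n)
    stride = a // n                           # str = floor(a / n)
    bins_per_side = (a - win) // stride + 1   # pooling output size without padding
    return win, stride, bins_per_side

def max_pool_2d(fmap, win, stride):
    """Plain max pooling over one square channel given as a list of lists."""
    a = len(fmap)
    starts = range(0, a - win + 1, stride)
    return [
        [max(fmap[r][c] for r in range(i, i + win) for c in range(j, j + win))
         for j in starts]
        for i in starts
    ]

def spp_vector(fmap, levels=(3, 2, 1)):
    """Concatenate the pooled bins of every pyramid level into one flat list."""
    vec = []
    for n in levels:  # assumed pyramid: 3x3, 2x2, 1x1 bins
        win, stride, _ = spp_level(len(fmap), n)
        pooled = max_pool_2d(fmap, win, stride)
        vec.extend(v for row in pooled for v in row)
    return vec

def toy_map(a):
    """Hypothetical a x a feature map filled with 0 .. a*a-1."""
    return [[r * a + c for c in range(a)] for r in range(a)]

len_224 = len(spp_vector(toy_map(13)))  # assumed 13x13 conv5 map (224 input)
len_180 = len(spp_vector(toy_map(10)))  # 10x10 conv5 map (180 input)
print(len_224, len_180)
```

For these two sizes every level produces exactly n×n bins, so each channel contributes M = 9 + 4 + 1 = 14 values regardless of the input size, and with k filters the vector has kM entries. I have not checked that this window/stride rule yields n×n bins for every possible a and n, only for the sizes used here.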