Learning Deep Bilinear Transformation for Fine-grained Image Representation, Part 2

이준석 · November 2, 2022

2.1 Fine-Grained Image Recognition

Bilinear pooling.

Bilinear pooling [10] was proposed to obtain a rich, orderless global representation from the last convolutional feature map, and it achieved state-of-the-art results on many fine-grained datasets.
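To make "orderless global representation" concrete, here is a minimal NumPy sketch of plain bilinear pooling. It assumes a single feature map of shape (C, H, W) and averages the per-position outer products; the function name `bilinear_pool` and the toy sizes are illustrative choices, not taken from [10].

```python
import numpy as np

def bilinear_pool(feature_map):
    """Orderless bilinear descriptor of a (C, H, W) feature map (illustrative sketch)."""
    c, h, w = feature_map.shape
    x = feature_map.reshape(c, h * w)   # each column is the C-dim feature at one position
    gram = x @ x.T / (h * w)            # average of per-position outer products, shape (C, C)
    return gram.reshape(-1)             # flatten to a vector of length C * C

# Toy input (assumed sizes): 8 channels on a 4x4 grid.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))
desc = bilinear_pool(fmap)
```

Because the descriptor is a sum over positions, shuffling the spatial positions leaves it unchanged, which is what "orderless" refers to. Its length grows as C², so a 512-channel feature map would give 512² = 262,144 entries.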

However, computing pairwise interactions between channels makes the representation very high-dimensional, so dimension reduction methods have been proposed. Specifically, low-rank bilinear pooling [15] reduces the feature dimension before applying the bilinear transformation, and compact bilinear pooling [14] uses a sampling-based approximation that can reduce the feature dimension by two orders of magnitude without a performance drop.
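As an illustration of how a randomized approximation can avoid materializing the C × C outer product, the sketch below implements Tensor Sketch, a well-known technique of this kind. The text here only says "sampling based approximation", so treating Tensor Sketch as representative of [14] is an assumption, and the names `count_sketch`, `tensor_sketch`, and the sizes `c` and `d` are hypothetical.

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Project x into d buckets: out[h[i]] += s[i] * x[i]."""
    out = np.zeros(d)
    np.add.at(out, h, s * x)            # unbuffered add, so repeated buckets accumulate
    return out

def tensor_sketch(x, h1, s1, h2, s2, d):
    """d-dim sketch of the outer product of x with itself, without forming it."""
    f1 = np.fft.fft(count_sketch(x, h1, s1, d))
    f2 = np.fft.fft(count_sketch(x, h2, s2, d))
    # A product in the Fourier domain is a circular convolution of the two sketches.
    return np.fft.ifft(f1 * f2).real

# Toy setup (assumed sizes): C = 32 input channels sketched down to d = 16.
rng = np.random.default_rng(0)
c, d = 32, 16
h1 = rng.integers(0, d, size=c)         # random bucket index per channel
h2 = rng.integers(0, d, size=c)
s1 = rng.choice([-1.0, 1.0], size=c)    # random sign per channel
s2 = rng.choice([-1.0, 1.0], size=c)
x = rng.standard_normal(c)
sketch = tensor_sketch(x, h1, s1, h2, s2, d)
```

The result equals a count sketch of the full outer product with bucket `(h1[i] + h2[j]) % d` and sign `s1[i] * s2[j]`, but it is computed in O(C + d log d) time rather than O(C²).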

In contrast, we reduce the feature dimension by intra-group bilinear transformation and inter-group aggregation; a detailed discussion can be found in Section 3.4.
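The following is a rough sketch of what "intra-group bilinear transformation and inter-group aggregation" could look like, not the authors' implementation. It assumes that channels are split into contiguous, equally sized groups, that the bilinear term is an outer product within each group at each spatial position, and that aggregation is a sum over groups; the actual grouping and formulation are discussed in Section 3.4, which is not part of this excerpt. The name `group_bilinear` is hypothetical.

```python
import numpy as np

def group_bilinear(feature_map, num_groups):
    """Per-position group bilinear sketch: (C, H, W) -> ((C/G)^2, H, W)."""
    c, h, w = feature_map.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    k = c // num_groups                            # channels per group
    x = feature_map.reshape(num_groups, k, h * w)  # assumption: contiguous channel groups
    # Intra-group: outer product of each group's k channels at every position.
    # Inter-group: sum the resulting (k, k) matrices over the group axis g.
    y = np.einsum('gin,gjn->ijn', x, x)
    return y.reshape(k * k, h, w)

# Toy input (assumed sizes): 16 channels, 4 groups, 4x4 grid.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((16, 4, 4))
out = group_bilinear(fmap, num_groups=4)
```

Under these assumptions the output has (C/G)² channels instead of C²; in the toy case, 16 channels in 4 groups give 4² = 16 output channels, so the layer keeps the channel count unchanged.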

Moreover, feature matrix normalization [11–13] (e.g., matrix square-root normalization) has been shown to be important for bilinear features. We do not use such techniques in our deep bilinear transformation, because computing the matrix square root is expensive and impractical to stack deeply in CNNs.
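To show where the cost comes from, here is a minimal sketch of matrix square-root normalization through an eigendecomposition. This is one standard way to compute the square root of a symmetric positive semi-definite matrix; whether [11–13] use this route or an iterative approximation is not stated here, and the name `matrix_sqrt_psd` is illustrative.

```python
import numpy as np

def matrix_sqrt_psd(b):
    """Square root of a symmetric PSD matrix via eigendecomposition (O(C^3))."""
    eigvals, eigvecs = np.linalg.eigh(b)       # b = V diag(eigvals) V^T
    eigvals = np.clip(eigvals, 0.0, None)      # guard against tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

# Toy bilinear feature (assumed sizes): 8 channels, 16 spatial positions.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
b = x @ x.T / 16
root = matrix_sqrt_psd(b)
```

The eigendecomposition scales cubically with the number of channels, so applying it after every layer of a deep network would be far more expensive than applying it once to a final pooled feature.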
