[Simple Review] [2022 Applied Intelligence] Attention-based fusion factor in FPN for object detection

Hyungseop Lee · August 19, 2025

[Paper Review] Feature Fusion(Alignment) Networks

Background: Recently, most detectors use a feature pyramid to detect objects of various sizes.
Problem: However, existing FPN-based feature extraction networks focus mainly on capturing effective semantic information
and ignore the influence of the dataset's scale distribution on the FPN feature fusion process.
(My thought: what harm does this actually cause? Why is it a problem? This needs a more detailed explanation.)
Key idea / proposal: To address this, the authors propose a novel attention structure that can be applied to any FPN-based network model.
Unlike general attention, which attends to the feature itself, their method makes better use of the lower-layer feature among adjacent layers during feature fusion,
and this guides the filtering of the upper-layer feature.
By considering how the feature information of different feature maps differs for the same sample, it can better filter out the upper layer's sample features that are invalid for the lower layer.
The method better learns the degree to which deep features participate in shallow learning, so that each FPN layer can focus more on its own layer's learning. (?) (My thought: even for an abstract, this sentence feels too abstract...)
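
To make "the lower layer guides the filtering of the upper layer" a bit more concrete, here is a minimal sketch of one possible reading. It is not the paper's module: the sigmoid gate, the function names (`upsample2x`, `gated_fusion`), and the single-channel list-of-rows feature maps are all assumptions made for illustration.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                      # repeat each row twice
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(low, high):
    """Hypothetical gate (an assumption, not the paper's design): the
    lower-layer activation decides, per position, how much of the
    upsampled upper-layer feature is let through before the sum."""
    up = upsample2x(high)
    fused = []
    for low_row, up_row in zip(low, up):
        fused.append([l + sigmoid(l) * u for l, u in zip(low_row, up_row)])
    return fused


low = [[0.0, 0.0], [0.0, 0.0]]   # 2x2 lower-layer (high-resolution) map
high = [[4.0]]                   # 1x1 upper-layer (low-resolution) map
fused = gated_fusion(low, high)  # sigmoid(0) = 0.5, so each entry is 0 + 0.5 * 4
```

The point of the sketch is only the direction of the dependency: the gate is computed from the lower layer and applied to the upper layer, rather than a feature attending to itself.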

(Background)

In multi-scale object detection, the fusion of features with different scales is very important to the model's performance.
Low-level features have higher resolution and therefore carry more position and detail information,
but because they pass through fewer convolutions they have weaker semantics and more noise.
High-level features carry stronger semantic information, but their resolution is very small, so their perception of details is poor.
How to integrate these two kinds of features efficiently is important for improving the model.
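
As a reference point for the fusion being discussed, below is a minimal sketch of the usual FPN top-down step: upsample the high-level map and add it to the low-level map. The function names and the single-channel list-of-rows representation are assumptions for illustration, and the 1x1 lateral convolution used in real FPNs is omitted.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                      # repeat each row twice
    return out


def fuse_top_down(low, high):
    """Plain FPN-style fusion: element-wise sum of the low-level map
    and the upsampled high-level map (both weighted equally)."""
    up = upsample2x(high)
    return [[l + u for l, u in zip(low_row, up_row)]
            for low_row, up_row in zip(low, up)]


low = [[1.0, 2.0, 3.0, 4.0],
       [5.0, 6.0, 7.0, 8.0],
       [1.0, 1.0, 1.0, 1.0],
       [0.0, 0.0, 0.0, 0.0]]   # 4x4: high resolution, detailed
high = [[10.0, 20.0],
        [30.0, 40.0]]           # 2x2: low resolution, semantic
fused = fuse_top_down(low, high)
```

Each 2x2 block of `fused` receives the same high-level value, which is why the high-level branch contributes semantics but no fine detail.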

(Problem statement)

FPN has been continuously improved:
SFAM, PANet...
However, these works did not consider the influence of the object scale distribution within the dataset during feature fusion, and treated all input features equally.
The contribution of input features at different resolutions to the output feature is affected by the scale distribution of objects in the dataset, so it is unequal.
Yet FPN-based detection methods focus on capturing effective semantic information
and do not consider the influence of the object scale distribution in the dataset.
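
To illustrate what "treated equally" means, here is a minimal sketch that adds a scalar fusion factor `alpha` to the top-down sum. The name `alpha`, the helper names, and the idea of using a single scalar per level are assumptions for illustration; these notes do not say how the paper actually computes its factor.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                      # repeat each row twice
    return out


def fuse_with_factor(low, high, alpha):
    """Weighted top-down fusion: low + alpha * upsample(high).
    alpha = 1.0 is the equal-weight sum; alpha < 1.0 lets less of the
    high-level feature into the low-level map (hypothetical knob)."""
    up = upsample2x(high)
    return [[l + alpha * u for l, u in zip(low_row, up_row)]
            for low_row, up_row in zip(low, up)]


low = [[1.0, 1.0], [1.0, 1.0]]
high = [[4.0]]

equal = fuse_with_factor(low, high, 1.0)   # every input treated equally
damped = fuse_with_factor(low, high, 0.5)  # high-level contribution halved
off = fuse_with_factor(low, high, 0.0)     # low-level map passes through unchanged
```

If the best `alpha` depends on how object sizes are distributed in the dataset, as the notes above argue, then fixing it to 1.0 for every level is the equal-treatment assumption being criticized.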

(What problem does this cause, concretely?)

(Related work)

(Proposal)

AugFPN points out some design flaws in FPN's feature pyramid and addresses them with consistency supervision, and achieves better detection results with residual feature enhancement and soft RoI selection.
SNIP first analyzes the relationship between small scales and the scale of the pretraining model, and then proposes a scale-normalized training mechanism to reduce the effect of domain shift.
...