paper
: YOLOv3: An Incremental Improvement
author
: Joseph Redmon, Ali Farhadi
subject
: This paper reads more like a tech report.
Like YOLO9000, YOLOv3 predicts bounding boxes using anchor boxes obtained from dimension clusters.
(Each bbox is predicted relative to a preset anchor box, shown as a black dashed line.)
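The network predicts 4 coordinates for each bbox: $t_x$, $t_y$, $t_w$, $t_h$.
If the cell is offset from the top-left corner of the image by $(c_x, c_y)$,
and the bbox prior has width and height $p_w$, $p_h$,
then the predictions are:
$b_x = \sigma(t_x) + c_x$
$b_y = \sigma(t_y) + c_y$
$b_w = p_w e^{t_w}$
$b_h = p_h e^{t_h}$
During training we use sum of squared error loss.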
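A minimal sketch of this decoding step may help. It assumes the box parameterization from the YOLOv3 paper: b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w·e^(t_w), b_h = p_h·e^(t_h), with all values in grid-cell units. The function names `sigmoid` and `decode_box` are hypothetical and not taken from any particular implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function; squashes t_x and t_y into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, prior):
    """Map raw outputs (t_x, t_y, t_w, t_h) to a box (b_x, b_y, b_w, b_h)."""
    t_x, t_y, t_w, t_h = t
    c_x, c_y = cell    # offset of the grid cell from the image's top-left corner
    p_w, p_h = prior   # width and height of the bounding box prior
    b_x = sigmoid(t_x) + c_x   # center stays inside the cell
    b_y = sigmoid(t_y) + c_y
    b_w = p_w * math.exp(t_w)  # prior is scaled, never made negative
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h


# All-zero outputs place the center in the middle of the cell
# and leave the prior's size unchanged.
print(decode_box((0.0, 0.0, 0.0, 0.0), cell=(3, 4), prior=(2.0, 5.0)))
# (3.5, 4.5, 2.0, 5.0)
```

Because the sigmoid keeps the center offsets between 0 and 1, a cell can only predict boxes whose center lies inside that cell.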
YOLOv3 predicts an objectness score (= confidence score?) for each bbox using logistic regression.
The objectness score should be 1 if the bbox prior overlaps a ground truth object by more than any other bbox prior.
Unlike [17] our system only assigns one bounding box prior for each ground truth object.
If a bounding box prior is not assigned to a ground truth object,
it incurs no loss for coordinate or class predictions, only objectness.
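The assignment rule can be sketched roughly as follows. This is only an illustration: it assumes the overlap is measured as the IOU of widths and heights with both boxes sharing a center (a common simplification, not something stated in these notes), and `shape_iou`, `assign_priors`, and `loss_terms` are hypothetical names. The prior sizes in the example are made up.

```python
def shape_iou(wh_a, wh_b):
    """IOU of two boxes given as (width, height), assuming a shared center."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def assign_priors(gt_boxes, priors):
    """For each ground truth (w, h), return the index of the single best prior."""
    return [
        max(range(len(priors)), key=lambda i: shape_iou(gt, priors[i]))
        for gt in gt_boxes
    ]


def loss_terms(num_priors, assigned):
    """List the loss terms each prior contributes to."""
    responsible = set(assigned)
    return [
        ("coord", "class", "objectness") if i in responsible else ("objectness",)
        for i in range(num_priors)
    ]


priors = [(1.0, 1.0), (2.0, 4.0), (5.0, 3.0)]  # made-up prior sizes
assigned = assign_priors([(2.0, 3.5)], priors)
print(assigned)                           # [1]
print(loss_terms(len(priors), assigned))
# [('objectness',), ('coord', 'class', 'objectness'), ('objectness',)]
```

Only the prior at index 1 is responsible for the object, so the other two priors are penalized on objectness alone.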
YOLOv3 predicts boxes at 3 different scales.
We predict 3 boxes at each scale.
The last convolutional layer predicts a 3-d tensor encoding bbox, objectness, and class predictions.
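As a quick shape check, the size of that tensor can be computed as below. The sketch assumes each box carries 4 coordinates, 1 objectness score, and one score per class. The 80 classes (COCO) and the 13×13 grid are illustrative assumptions that do not appear in these notes, and `output_shape` is a hypothetical helper.

```python
def output_shape(grid_size, boxes_per_scale=3, num_classes=80):
    """Shape of the prediction tensor at one scale: N x N x [B * (4 + 1 + C)]."""
    per_box = 4 + 1 + num_classes  # bbox coordinates + objectness + class scores
    return (grid_size, grid_size, boxes_per_scale * per_box)


print(output_shape(13))  # (13, 13, 255)
```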
We still use k-means clustering to determine our bbox priors.
We just sort of chose 9 clusters and 3 scales arbitrarily and then
divide up the clusters evenly across scales.
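A rough sketch of how such priors might be obtained is shown below. It assumes the distance 1 - IOU between box shapes, as in the dimension clusters of YOLO9000. The mean-based centroid update, the deterministic initialization, and the sort-by-area split across scales are all assumptions of this sketch, and the function names are hypothetical.

```python
def shape_iou(wh_a, wh_b):
    """IOU of two boxes given as (width, height), assuming a shared center."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    return inter / (wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter)


def kmeans_priors(boxes, k, iters=50):
    """Cluster (w, h) pairs with k-means, using 1 - IOU as the distance."""
    # Assumed init: pick k boxes evenly spaced when sorted by area.
    by_area = sorted(boxes, key=lambda b: b[0] * b[1])
    centers = [by_area[(2 * i + 1) * len(by_area) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for box in boxes:
            # Highest IOU is the same as smallest 1 - IOU distance.
            nearest = max(range(k), key=lambda i: shape_iou(box, centers[i]))
            groups[nearest].append(box)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


def split_across_scales(priors, num_scales=3):
    """Divide area-sorted priors evenly into one chunk per scale."""
    per_scale = len(priors) // num_scales
    return [priors[i * per_scale:(i + 1) * per_scale] for i in range(num_scales)]
```

With 9 clusters and 3 scales, `split_across_scales(kmeans_priors(boxes, 9))` would yield three groups of three priors each, ordered from smallest to largest area.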