๐Ÿ‘€ YOLO(You Look Only Once)

Jaeyeon Heoยท2025๋…„ 8์›” 28์ผ
post-thumbnail

YOLO๋Š” ์‹ค์‹œ๊ฐ„ ๊ฐ์ฒด ๊ฒ€์ถœ(Real-time Object Detection)์„ ์œ„ํ•ด ๊ฐœ๋ฐœ๋œ ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ „ํ†ต์ ์ธ ๊ฐ์ฒด ๊ฒ€์ถœ ๋ฐฉ์‹๊ณผ ๋‹ฌ๋ฆฌ, YOLO๋Š” ์ด๋ฏธ์ง€๋ฅผ ํ•œ ๋ฒˆ์— ์ „์ฒด์ ์œผ๋กœ ๋ถ„์„ํ•˜์—ฌ ๊ฐ์ฒด๋ฅผ ๊ฒ€์ถœํ•˜๊ณ , ๊ฐ ๊ฐ์ฒด์˜ ์œ„์น˜์™€ ํด๋ž˜์Šค ์ •๋ณด๋ฅผ ๋™์‹œ์— ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค. ์ด ์ ‘๊ทผ ๋ฐฉ์‹ ๋•๋ถ„์— ๋†’์€ ์†๋„์™€ ๋น„๊ต์  ์•ˆ์ •์ ์ธ ์ •ํ™•๋„๋ฅผ ๋™์‹œ์— ๋‹ฌ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ํ•ต์‹ฌ ์•„์ด๋””์–ด

  • ์ด๋ฏธ์ง€๋ฅผ S x S ๊ทธ๋ฆฌ๋“œ๋กœ ๋ถ„ํ• 
  • ๊ฐ ๊ทธ๋ฆฌ๋“œ ์…€์—์„œ ๊ฐ์ฒด ์ค‘์‹ฌ์ด ์กด์žฌํ•˜๋ฉด ๊ฒ€์ถœ ๋‹ด๋‹น
  • Bounding Box + Confidence Score + Class Probability ๋™์‹œ ์˜ˆ์ธก

ํŠน์ง•

  • ํ•œ ๋ฒˆ์˜ forward pass๋กœ ์ „์ฒด ์ด๋ฏธ์ง€ ๊ฐ์ฒด ๊ฒ€์ถœ ๊ฐ€๋Šฅ
  • GoogLeNet ๊ธฐ๋ฐ˜ ๊ตฌ์กฐ โ†’ Convolutional Layer๋กœ ํŠน์ง• ์ถ”์ถœ, Fully Connected Layer๋กœ ์ขŒํ‘œ/ํด๋ž˜์Šค ์˜ˆ์ธก
  • ๋‹ค์–‘ํ•œ ํฌ๊ธฐ, ์ข…ํšก๋น„์˜ ๊ฐ์ฒด๋ฅผ ๋™์‹œ์— ์ฒ˜๋ฆฌ ๊ฐ€๋Šฅ
  • ์‹ค์‹œ๊ฐ„ ์„ฑ๋Šฅ ์ตœ์ ํ™”

์žฅ์ 

  • ๋น ๋ฅธ ์†๋„์™€ ์•ˆ์ •์ ์ธ ์ •ํ™•๋„
  • ๋‹จ์ผ ๋„คํŠธ์›Œํฌ๋กœ ๊ฐ์ฒด ์กด์žฌ ์—ฌ๋ถ€, ์œ„์น˜, ํด๋ž˜์Šค ์ •๋ณด ๋ชจ๋‘ ์˜ˆ์ธก
  • ๊ธฐ์กด ๋ฐฉ์‹(Region Proposal ๊ธฐ๋ฐ˜) ๋Œ€๋น„ ๋‹จ์ˆœํ•˜๊ณ  ํšจ์œจ์ 

๐Ÿ“Œ Unified Detection

YOLO(You Only Look Once)๋Š” ์ž…๋ ฅ ์ด๋ฏธ์ง€๋ฅผ ๐‘†ร—๐‘† ๊ทธ๋ฆฌ๋“œ๋กœ ๋ถ„ํ• ํ•˜์—ฌ ๊ฐ์ฒด๋ฅผ ๊ฒ€์ถœํ•ฉ๋‹ˆ๋‹ค. ์–ด๋–ค ๊ฐ์ฒด์˜ ์ค‘์‹ฌ์ด ํŠน์ • ๊ทธ๋ฆฌ๋“œ ์…€ ์•ˆ์— ์œ„์น˜ํ•˜๋ฉด, ๊ทธ ์…€์—์„œ ํ•ด๋‹น ๊ฐ์ฒด๋ฅผ ๊ฐ์ง€ํ•ฉ๋‹ˆ๋‹ค. ๊ฐ ๊ทธ๋ฆฌ๋“œ ์…€์€ ๐ต๊ฐœ์˜ bounding box์™€ ๊ฐ ๋ฐ•์Šค์˜ confidence score๋ฅผ ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

Confidence score: bounding box๊ฐ€ ์‹ค์ œ ๊ฐ์ฒด๋ฅผ ํฌํ•จํ•  ํ™•๋ฅ ๊ณผ ์˜ˆ์ธก๋œ ๋ฐ•์Šค๊ฐ€ ๊ฐ์ฒด์— ์–ผ๋งˆ๋‚˜ ์ž˜ ๋งž๋Š”์ง€๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ์ง€ํ‘œ

๊ฐ bounding box๋Š” ๋‹ค์Œ 5๊ฐ€์ง€ ์ •๋ณด๋ฅผ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.

X, Y: ๊ทธ๋ฆฌ๋“œ ์…€ ๋‚ด์—์„œ bounding box ์ค‘์‹ฌ์˜ ์ƒ๋Œ€ ์ขŒํ‘œ

W, H: ์ด๋ฏธ์ง€ ์ „์ฒด์— ๋Œ€ํ•œ ์ƒ๋Œ€ ๋„ˆ๋น„์™€ ๋†’์ด

Confidence: ๋ฐ•์Šค์˜ ์‹ ๋ขฐ๋„

๋˜ํ•œ, ๊ฐ ๊ทธ๋ฆฌ๋“œ ์…€์€ Conditional Class Probability๋ฅผ ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.

Conditional Class Probability: ๊ทธ๋ฆฌ๋“œ ์…€ ์•ˆ์— ๊ฐ์ฒด๊ฐ€ ์กด์žฌํ•œ๋‹ค๋Š” ์กฐ๊ฑด ํ•˜์—์„œ, ํ•ด๋‹น ๊ฐ์ฒด๊ฐ€ ํŠน์ • ํด๋ž˜์Šค์ผ ํ™•๋ฅ 

ํ…Œ์ŠคํŠธ ๋‹จ๊ณ„์—์„œ๋Š” Conditional Class Probability์™€ bounding box์˜ Confidence Score๋ฅผ ๊ณฑํ•˜์—ฌ Class Specific Confidence Score๋ฅผ ๊ณ„์‚ฐํ•ฉ๋‹ˆ๋‹ค.

lass Specific Confidence Score: bounding box๊ฐ€ ํŠน์ • ํด๋ž˜์Šค ๊ฐ์ฒด๋ฅผ ํฌํ•จํ•  ํ™•๋ฅ ๊ณผ ํ•ด๋‹น ๋ฐ•์Šค๊ฐ€ ๊ฐ์ฒด์— ์–ผ๋งˆ๋‚˜ ์ž˜ ๋งž๋Š”์ง€๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’

YOLO์˜ ์•„ํ‚คํ…์ฒ˜๋Š” Google์˜ GoogLeNet ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์—์„œ ์˜๊ฐ์„ ๋ฐ›์•˜์œผ๋ฉฐ, 24๊ฐœ์˜ Convolutional Layer์™€ 2๊ฐœ์˜ Fully Connected Layer๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.

Convolutional Layers: ์ด๋ฏธ์ง€ ํŠน์ง•(feature) ์ถ”์ถœ

Fully Connected Layers: ํด๋ž˜์Šค ํ™•๋ฅ  ๋ฐ bounding box ์ขŒํ‘œ ์˜ˆ์ธก

GoogLeNet์˜ ์ธ์…‰์…˜ ๋ชจ๋“ˆ๊ณผ ๋‹ฌ๋ฆฌ, YOLO๋Š” 1ร—1 ์ถ•์†Œ ๊ณ„์ธต(reduction layer)๊ณผ 3ร—3 ํ•ฉ์„ฑ๊ณฑ ๊ณ„์ธต(convolutional layer)์œผ๋กœ ๊ตฌ์„ฑ๋ฉ๋‹ˆ๋‹ค.

๐Ÿ“Œ Training

YOLO๋Š” ImageNet ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ์‚ฌ์ „ ํ•™์Šต๋œ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์„ ๊ฐ์ฒด ๊ฒ€์ถœ ๋ชจ๋ธ๋กœ ์ „ํ™˜ํ•˜์—ฌ ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ ๊ตฌ์กฐ๋Š” 20๊ฐœ์˜ Convolutional Layers์™€ 2๊ฐœ์˜ Fully Connected Layers๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์œผ๋ฉฐ, ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•ด ์ถ”๊ฐ€๋กœ 4๊ฐœ์˜ Convolutional Layers์™€ 2๊ฐœ์˜ Fully Connected Layers๊ฐ€ ํฌํ•จ๋ฉ๋‹ˆ๋‹ค.

๋งˆ์ง€๋ง‰ ๊ณ„์ธต: ์„ ํ˜• ํ™œ์„ฑํ™” ํ•จ์ˆ˜
๋‚˜๋จธ์ง€ ๊ณ„์ธต: Leaky ReLU

Loss ํ•จ์ˆ˜๋กœ๋Š” SSE(sum-squared error)๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. SSE๋Š” ํ•™์Šต ์ตœ์ ํ™”๊ฐ€ ์‰ฝ๋‹ค๋Š” ์žฅ์ ์ด ์žˆ์ง€๋งŒ, Localization Loss์™€ Classification Loss๊ฐ€ ๋™์ผํ•œ ๊ฐ€์ค‘์น˜๋ฅผ ๊ฐ€์ ธ mAP ํ–ฅ์ƒ์—๋Š” ์ตœ์ ์ด ์•„๋‹ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Localization Loss: bounding box ์œ„์น˜ ์˜ˆ์ธก ์ •ํ™•๋„

Classification Loss: ํด๋ž˜์Šค ์˜ˆ์ธก ์ •ํ™•๋„

ํ•™์Šต ๊ณผ์ •์—์„œ ๋Œ€๋ถ€๋ถ„์˜ ๊ทธ๋ฆฌ๋“œ ์…€์—๋Š” ๊ฐ์ฒด๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์—, ํ•™์Šต์ด ๋ถˆ๊ท ํ˜•์ ์œผ๋กœ ์ด๋ฃจ์–ด์งˆ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์ด๋ฅผ ๋ณด์™„ํ•˜๊ธฐ ์œ„ํ•ด

๊ฐ์ฒด๊ฐ€ ์žˆ๋Š” bounding box: Loss ์ฆ๊ฐ€
๊ฐ์ฒด๊ฐ€ ์—†๋Š” bounding box: Loss ๊ฐ์†Œ

๋˜ํ•œ, ์ž‘์€ bounding box๋ณด๋‹ค ํฐ bounding box๋Š” ์œ„์น˜ ๋ณ€ํ™”์— ๋œ ๋ฏผ๊ฐํ•˜๋ฏ€๋กœ, Width์™€ Height์— ์ œ๊ณฑ๊ทผ(square root)์„ ์ ์šฉํ•˜์—ฌ Loss ๊ฐ€์ค‘์น˜๋ฅผ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค.
bounding box ์„ ํƒ ์‹œ, ์—ฌ๋Ÿฌ ํ›„๋ณด ์ค‘ IoU(Intersection over Union)๊ฐ€ ๊ฐ€์žฅ ๋†’์€ ๋ฐ•์Šค๋ฅผ ์„ ํƒํ•˜์—ฌ ๊ฐ์ฒด๋ฅผ ์ •ํ™•ํžˆ ๊ฐ์‹ธ๋„๋ก ํ•™์Šตํ•ฉ๋‹ˆ๋‹ค.

์ด๋ฅผ ํ†ตํ•ด YOLO๋Š” ๋‹ค์–‘ํ•œ ํฌ๊ธฐ, ์ข…ํšก๋น„, ํด๋ž˜์Šค ๊ฐ์ฒด๋ฅผ ์˜ˆ์ธกํ•  ์ˆ˜ ์žˆ๋Š” ๋Šฅ๋ ฅ์„ ๊ฐ–์ถ”๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

๐Ÿ“ŒYOLO ์•„ํ‚คํ…์ฒ˜ ๊ฐœ์š”

  1. ์ž…๋ ฅ ์ด๋ฏธ์ง€ -> S x S ๊ทธ๋ฆฌ๋“œ๋กœ ๋ถ„ํ• 
    โ””> ๊ฐ ๊ทธ๋ฆฌ๋“œ ์…€์€ ๊ฐ์ฒด ์ค‘์‹ฌ์ด ์กด์žฌํ•˜๋ฉด ๊ฐ์ง€ ๋‹ด๋‹น

  2. ๊ฐ ๊ทธ๋ฆฌ๋“œ ์…€ ์˜ˆ์ธก
    โ”œโ”€ B๊ฐœ์˜ Bounding Box
    โ”‚ โ”œโ”€ X, Y : ๊ทธ๋ฆฌ๋“œ ๋‚ด ์ค‘์‹ฌ ์ขŒํ‘œ
    โ”‚ โ”œโ”€ W, H : ์ด๋ฏธ์ง€ ์ „์ฒด์— ๋Œ€ํ•œ ์ƒ๋Œ€ ๋„ˆ๋น„/๋†’์ด
    โ”‚ โ””โ”€ Confidence Score : ๋ฐ•์Šค ์‹ ๋ขฐ๋„
    โ””โ”€ Conditional Class Probability
    โ””> ๊ทธ๋ฆฌ๋“œ ์…€์— ๊ฐ์ฒด๊ฐ€ ์žˆ์„ ์กฐ๊ฑด ํ•˜์— ํด๋ž˜์Šค ํ™•๋ฅ 

  3. ํ…Œ์ŠคํŠธ ๋‹จ๊ณ„
    โ””โ”€ Conditional Class Probability ร— Confidence Score
    โ””> Class Specific Confidence Score
    โ””> ํ•ด๋‹น ํด๋ž˜์Šค ๊ฐ์ฒด๊ฐ€ ๋ฐ•์Šค์— ์กด์žฌํ•  ํ™•๋ฅ  ร— ๋ฐ•์Šค ์ ํ•ฉ๋„

  4. ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ
    โ”œโ”€ Convolutional Layers (24๊ฐœ)
    โ”‚ โ””> ์ด๋ฏธ์ง€ ํŠน์ง• ์ถ”์ถœ
    โ”œโ”€ Fully Connected Layers (2๊ฐœ)
    โ”‚ โ””> ํด๋ž˜์Šค ํ™•๋ฅ  ๋ฐ bounding box ์ขŒํ‘œ ์˜ˆ์ธก
    โ””โ”€ ํŠน์ง•
    โ”œโ”€ 1x1 ์ถ•์†Œ ๊ณ„์ธต
    โ””โ”€ 3x3 ํ•ฉ์„ฑ๊ณฑ ๊ณ„์ธต
    โ””> GoogLeNet ์ธ์…‰์…˜ ๋ชจ๋“ˆ๊ณผ ๋‹ค๋ฆ„

  5. ํ•™์Šต(Training)
    โ”œโ”€ ์‚ฌ์ „ ํ•™์Šต: ImageNet ๊ธฐ๋ฐ˜ ๋ถ„๋ฅ˜ ๋ชจ๋ธ ์ „ํ™˜
    โ”œโ”€ ์ถ”๊ฐ€ ๊ณ„์ธต: +4 Conv, +2 FC (์„ฑ๋Šฅ ํ–ฅ์ƒ)
    โ”œโ”€ ํ™œ์„ฑํ™” ํ•จ์ˆ˜
    โ”‚ โ”œโ”€ ๋งˆ์ง€๋ง‰ ๊ณ„์ธต: ์„ ํ˜•
    โ”‚ โ””โ”€ ๋‚˜๋จธ์ง€: Leaky ReLU
    โ””โ”€ Loss: SSE (Sum-Squared Error)
    โ”œโ”€ Localization Loss: bounding box ์œ„์น˜ ์ •ํ™•๋„
    โ”œโ”€ Classification Loss: ํด๋ž˜์Šค ์˜ˆ์ธก ์ •ํ™•๋„
    โ”œโ”€ ํ•™์Šต ๋ถˆ๊ท ํ˜• ๋ณด์ •
    โ”‚ โ”œโ”€ ๊ฐ์ฒด ์žˆ๋Š” ๋ฐ•์Šค: Loss ์ฆ๊ฐ€
    โ”‚ โ””โ”€ ๊ฐ์ฒด ์—†๋Š” ๋ฐ•์Šค: Loss ๊ฐ์†Œ
    โ””โ”€ Bounding Box ๊ฐ€์ค‘์น˜ ์กฐ์ •
    โ”œโ”€ W, H โ†’ sqrt ์ ์šฉ (์ž‘์€ ๋ฐ•์Šค ๋ฏผ๊ฐ๋„ ์ฆ๊ฐ€)
    โ””โ”€ IoU ์ตœ๋Œ€ ๋ฐ•์Šค ์„ ํƒ (์ •ํ™•ํ•œ ๊ฐ์ฒด ๊ฐ์‹ธ๊ธฐ)

  6. ๊ฒฐ๊ณผ
    โ””โ”€ ๋‹ค์–‘ํ•œ ํฌ๊ธฐ, ์ข…ํšก๋น„, ํด๋ž˜์Šค ๊ฐ์ฒด ์˜ˆ์ธก ๊ฐ€๋Šฅ


์งˆ๋ฌธ ๋ชจ์Œ

โ‰๏ธ ํ•˜๋‚˜์˜ ๊ทธ๋ฆฌ๋“œ ์…€์ด ์—ฌ๋Ÿฌ ๊ฐ์ฒด๋ฅผ ํฌํ•จํ•˜๋ฉด ์–ด๋–ป๊ฒŒ ์ฒ˜๋ฆฌ๋ ๊นŒ?

YOLO๋Š” ํ•œ ๊ทธ๋ฆฌ๋“œ ์…€๋‹น ํ•˜๋‚˜์˜ ๊ฐ์ฒด๋งŒ ๋‹ด๋‹นํ•ฉ๋‹ˆ๋‹ค.
๋งŒ์•ฝ ์…€ ์•ˆ์— ๊ฐ์ฒด๊ฐ€ ์—ฌ๋Ÿฌ ๊ฐœ๋ฉด, ๋ชจ๋ธ์€ ๊ฐ€์žฅ ์ค‘์‹ฌ์ด ์…€์— ๊ฐ€๊นŒ์šด ๊ฐ์ฒด๋งŒ ์˜ˆ์ธกํ•˜๊ณ  ๋‚˜๋จธ์ง€๋Š” ๋‹ค๋ฅธ ์…€์—์„œ ๊ฐ์ง€๋˜๋„๋ก ํ•ฉ๋‹ˆ๋‹ค.

โ‰๏ธ B๊ฐœ์˜ bounding box๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ผ๊นŒ?

ํ•˜๋‚˜์˜ ์…€์—์„œ ๊ฐ์ฒด ์œ„์น˜๊ฐ€ ์ •ํ™•ํžˆ ์–ด๋””์ธ์ง€ ๋ชจ๋ฅผ ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ ์—ฌ๋Ÿฌ ํ›„๋ณด ๋ฐ•์Šค(B๊ฐœ)๋ฅผ ์˜ˆ์ธกํ•ฉ๋‹ˆ๋‹ค.
ํ•™์Šต ์ค‘ ๊ฐ€์žฅ ์‹ค์ œ ๊ฐ์ฒด์™€ ์ž˜ ๋งž๋Š” ๋ฐ•์Šค๋ฅผ ์„ ํƒํ•ด ์ตœ์ข… ์˜ˆ์ธก์— ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

โ‰๏ธ Loss ํ•จ์ˆ˜๋กœ SSE(sum-squared error)๋ฅผ ์“ฐ๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ผ๊นŒ?

SSE๋Š” ์œ„์น˜, ํฌ๊ธฐ, ํด๋ž˜์Šค ์˜ˆ์ธก ๋ชจ๋‘ ๋™์ผํ•˜๊ฒŒ ๊ฐ€์ค‘์น˜๋ฅผ ๋‘๊ณ  ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์–ด ๊ตฌํ˜„๊ณผ ์ตœ์ ํ™”๊ฐ€ ์‰ฝ์Šต๋‹ˆ๋‹ค.
๋‹จ์ : ์ž‘์€ ๊ฐ์ฒด๋‚˜ ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•์—๋Š” ์ตœ์ ํ™”๊ฐ€ ์ž˜ ์•ˆ ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

โ‰๏ธ Bounding Box ์„ ํƒ ์‹œ IoU๊ฐ€ ๊ฐ€์žฅ ๋†’์€ ๊ฒƒ์„ ์„ ํƒํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ์–ด๋–ค ์˜๋ฏธ์ผ๊นŒ?

IoU(Intersection over Union)๊ฐ€ ๋†’์€ ๋ฐ•์Šค๋Š” ์‹ค์ œ ๊ฐ์ฒด๋ฅผ ๊ฐ€์žฅ ์ž˜ ๊ฐ์‹ธ๋Š” ๋ฐ•์Šค์ž…๋‹ˆ๋‹ค.
ํ•™์Šต๊ณผ์ •์—์„œ ์ด๋ฅผ ๊ธฐ์ค€์œผ๋กœ ์„ ํƒํ•˜๋ฉด ๋ชจ๋ธ์ด ๊ฐ์ฒด ์œ„์น˜๋ฅผ ์ •ํ™•ํžˆ ์˜ˆ์ธกํ•˜๋„๋ก ์œ ๋„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

โ‰๏ธ YOLO๊ฐ€ ์‹ค์‹œ๊ฐ„ ๊ฐ์ฒด ๊ฒ€์ถœ์— ๊ฐ•ํ•œ ์ด์œ ๋Š” ๋ฌด์—‡์ผ๊นŒ?

์ด๋ฏธ์ง€ ์ „์ฒด๋ฅผ ํ•œ ๋ฒˆ์— ์ฒ˜๋ฆฌํ•ด ๋ชจ๋“  ๊ฐ์ฒด๋ฅผ ๋™์‹œ์— ์˜ˆ์ธกํ•˜๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.
Region Proposal ๊ฐ™์€ ๋‹จ๊ณ„์  ์ฒ˜๋ฆฌ ์—†์ด ๋‹จ์ผ ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ๋กœ ๋๋‚ด๊ธฐ ๋•Œ๋ฌธ์— ์†๋„๊ฐ€ ๋น ๋ฆ…๋‹ˆ๋‹ค.

โ‰๏ธ YOLO๊ฐ€ ์ž‘์€ ๊ฐ์ฒด๋ฅผ ์ž˜ ์žก์ง€ ๋ชปํ•˜๋Š” ๊ฒฝ์šฐ๊ฐ€ ์žˆ๋Š”๋ฐ, ์ด์œ ๊ฐ€ ๋ญ˜๊นŒ?

์ž‘์€ ๊ฐ์ฒด๋Š” ๊ทธ๋ฆฌ๋“œ ์…€ ํ•˜๋‚˜์— ์™„์ „ํžˆ ๋“ค์–ด๊ฐ€๊ฑฐ๋‚˜, ์…€ ํฌ๊ธฐ์— ๋น„ํ•ด ๋„ˆ๋ฌด ์ž‘์•„์„œ ๊ฐ์ฒด ์ค‘์‹ฌ์ด ์ •ํ™•ํžˆ ๋งž์ง€ ์•Š์œผ๋ฉด ๊ฐ์ง€ ์–ด๋ ค์›€
๊ทธ๋ฆฌ๋“œ ๋ถ„ํ•  ๋ฐฉ์‹๊ณผ ์ƒ๋Œ€ ์ขŒํ‘œ ๊ณ„์‚ฐ ๋•Œ๋ฌธ์— ์ž‘์€ ๊ฐ์ฒด์— ๋Œ€ํ•œ ์˜ˆ์ธก ์ •ํ™•๋„๊ฐ€ ๋‚ฎ์•„์ง‘๋‹ˆ๋‹ค.

โ‰๏ธ YOLO ๊ฒฐ๊ณผ๋ฅผ ํ›„์ฒ˜๋ฆฌ(NMS ๋“ฑ) ์—†์ด ๋ฐ”๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์„๊นŒ?

๋ณดํ†ต์€ Non-Maximum Suppression(NMS)์„ ์‚ฌ์šฉํ•ด ๊ฒน์น˜๋Š” ๋ฐ•์Šค๋ฅผ ์ œ๊ฑฐํ•ฉ๋‹ˆ๋‹ค.
ํ›„์ฒ˜๋ฆฌ ์—†์ด ๋ฐ”๋กœ ์“ฐ๋ฉด ๋™์ผ ๊ฐ์ฒด๊ฐ€ ์—ฌ๋Ÿฌ ๋ฐ•์Šค๋กœ ์ค‘๋ณต ๊ฒ€์ถœ๋  ์ˆ˜ ์žˆ์–ด ์‹ค๋ฌด์—์„œ๋Š” ๊ฑฐ์˜ ํ•„์ˆ˜์ž…๋‹ˆ๋‹ค.


References

profile
ํ•ด๋ณด๊ณ  ์‹ถ์€ ๊ฒƒ์ด ๋„ˆ๋ฌด๋‚˜๋„ ๋งŽ์€ ํ•˜๊ณ ์žก์ด

0๊ฐœ์˜ ๋Œ“๊ธ€