[Paper Review] 📐 ArcFace: Additive Angular Margin Loss for Deep Face Recognition

강콩콩 · March 30, 2022

🎃 Hello! Today I'd like to briefly summarize a paper I recently enjoyed reading.
🧨 This post is a set of study notes, so the summary may fall short in places :) Thanks for your understanding!

https://arxiv.org/pdf/1801.07698.pdf

😎 This is ArcFace, a paper I came across while searching for material on the face recognition task! So, let's dig in!

✨Abstract

Earlier papers tackled tasks such as face recognition with the same approach used for general image classification.
Briefly, a general image classification model is trained in the following order:
1. A Neural Net Architecture is built by stacking Conv layers deeply, usually called DCNNs,
2. the output is pooled and passed through an MLP layer to produce logits.
3. The logits pass through the softmax activation function to become probabilities,
4. and for classification problems the Neural Net is usually trained with the Cross Entropy function as the loss.
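Steps 3 and 4 can be sketched in a few lines of plain Python. This is only a minimal illustration: the logit values are hypothetical and stand in for the output of the final MLP layer, and the function names are my own.

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability that softmax assigns to the true class."""
    return -math.log(softmax(logits)[label])

logits = [2.0, 0.5, -1.0]          # hypothetical output of the final MLP layer
probs = softmax(logits)            # step 3: logits -> probabilities
loss = cross_entropy(logits, 0)    # step 4: loss when class 0 is the true label
print(probs, loss)
```

The loss is small when the true class already has the largest logit and grows as probability moves to the other classes.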

😃 But! The face recognition task is a little different from the typical problem.
🤣 There are only a few images per label (they are people's faces, after all), and the number of labels is enormous (one per person used in training).

🤔 To deal with this, the authors propose a concept called "Additive Angular Margin Loss"!
😋 They change the logits so that classification is done "using the notion of angle", and report that the model learns the boundaries between classes better than with the existing approach.

✔+alpha

There is an earlier study called "SphereFace", which is cited in the paper introduced today.

https://arxiv.org/abs/1704.08063

That paper makes the following claim:
"The Weight Matrix of the last fully connected layer of a Neural Net can be expressed as an indicator of the centre of each class." If you keep this in mind, the rest should be much easier to follow :)

To explain a little more,

you can read it as treating the rows of the "Normalized Weight" in the figure above as vectors that represent the centre of each class!
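To make the "rows as class centres" idea concrete, here is a small sketch. The weight matrix `W` and the feature are made-up 2-D values, and `cosine_to_centres` is a hypothetical helper, not something from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def cosine_to_centres(feature, weight_rows):
    """Cosine similarity between a feature and every normalized weight row."""
    x = l2_normalize(feature)
    return [sum(a * b for a, b in zip(x, l2_normalize(w))) for w in weight_rows]

# Hypothetical weights: 3 classes, 2-D embeddings, one row per class.
W = [[1.0, 0.0], [0.0, 2.0], [-3.0, 0.0]]
feature = [0.2, 1.5]

cosines = cosine_to_centres(feature, W)
predicted = max(range(len(cosines)), key=cosines.__getitem__)
print(cosines, predicted)
```

The feature points almost straight along the second row, so the class whose centre has the smallest angle to it is picked.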

👍Introduction

😁 This section describes the advantages of ArcFace.
😎 In the end, the main message is that the Margin-Loss mentioned above works well, and ArcFace is what uses it!

According to the paper, references 18 and 19 show that using an angular margin penalty gives better results...😁

[18]: https://arxiv.org/pdf/1704.08063.pdf (SphereFace)
[19]: https://arxiv.org/pdf/1612.02295.pdf

👍 In any case, as I understand it, the idea is to train with the angle widened by an extra amount when the inner product is computed,
so that pairs whose angle should be small end up closer together and pairs whose angle should be large end up farther apart.
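A quick numeric check shows why widening the angle makes training stricter. The margin value `m = 0.5` radians is only an illustrative assumption, and `target_cosine` is a name I made up for this sketch.

```python
import math

def target_cosine(theta, m=0.0):
    """Cosine used for the true class once the margin m is added to the angle."""
    return math.cos(theta + m)

m = 0.5  # illustrative margin in radians (an assumption for this example)
for degrees in (30, 60, 90):
    theta = math.radians(degrees)
    plain = target_cosine(theta)
    with_margin = target_cosine(theta, m)
    print(f"{degrees:>2} deg: cos(theta)={plain:.3f}, cos(theta+m)={with_margin:.3f}")
```

For every angle in this range the margin lowers the true-class cosine, so the network has to pull the feature closer to its class centre to get the same score.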

🐱‍👤 Proposed Approach : 📐 ArcFace

🤦‍♂️ softmax & cross entropy

The existing softmax-based loss function is shown below.
It was also widely used for the face recognition task.
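For reference, the softmax loss is usually written as follows. The notation is assumed here: $x_i$ is the feature of the $i$-th sample with label $y_i$, $W_j$ and $b_j$ are the weight vector and bias of class $j$, $N$ is the batch size, and $n$ is the number of classes.

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{W_{y_i}^{T}x_i + b_{y_i}}}{\sum_{j=1}^{n}e^{W_j^{T}x_i + b_j}}
$$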

However, the authors explain the following.

However,
the softmax loss function does not explicitly optimise the feature embedding to enforce higher similarity for intra-class samples and diversity for inter-class samples, which results in a performance gap for deep face recognition under large intra-class appearance variations (e.g. pose variations and age gaps)

😁 In other words, the softmax loss does not explicitly push embeddings within the same class (intra-class) to be similar, nor embeddings of different classes (inter-class) to be different.
🤔 In the face recognition task, every class is a "person", so different identities still resemble one another, while the same person can look quite different across poses and ages (large intra-class variation).
😢 For this reason, the authors say, a performance gap can arise.

😎 Suggestion : new function

To solve the problems above, they drop the bias and focus on the angle between vectors.
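As a sketch, assuming the bias $b_j$ is set to 0, the logit of class $j$ is just an inner product, which can be written as two norms times a cosine:

$$
W_j^{T}x_i = \|W_j\|\,\|x_i\|\cos\theta_j
$$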

Here $\theta_j$ is the angle between the weight $W_j$ and the feature $x_i$.
As mentioned in the Abstract section, the weight is treated as the centre of each class, so it acts as a kind of reference point.
That is, the aim is to make the angle smaller when a sample belongs to the same class as the reference, and larger when it belongs to a different class.

👍 Here $\|W_j\|$ is fixed to 1 by $l_2$ normalisation, and $\|x_i\|$ is also $l_2$-normalised and then re-scaled to the scale factor $s$.
😎 In other words, training proceeds so that classification really is done "by angle".
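The "only the angle matters" point can be checked with a small sketch. The weights, the features, and the default `s=64.0` are illustrative assumptions, and `angular_logits` is a hypothetical helper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def angular_logits(feature, weight_rows, s=64.0):
    """Return s * cos(theta_j) for every class j, with no bias term."""
    x = l2_normalize(feature)
    return [s * sum(a * b for a, b in zip(x, l2_normalize(w))) for w in weight_rows]

W = [[1.0, 0.0], [0.0, 2.0]]             # hypothetical class weights
small = angular_logits([0.3, 0.4], W)
large = angular_logits([30.0, 40.0], W)  # same direction, 100x the length
print(small, large)
```

Both features point in the same direction, so their logits agree even though their lengths differ by a factor of 100.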

😉 Train !

📐 Applying the concepts above, training with the ArcFace loss proceeds in the following order:

  1. Compute the cosine similarity between the feature $x_i$ and $W$ to obtain $\cos\theta_j$.
  2. Use the $\arccos$ function to obtain $\theta_{y_i}$, the angle between the two vectors.
  3. Add the angular margin penalty $m$ to $\theta_{y_i}$ and compute $\cos(\theta_{y_i}+m)$.
  4. Multiply by the scale factor $s$ and apply softmax.
  5. Compute the cross entropy loss and backprop to train the Neural Net.
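The five steps can be sketched for a single sample in plain Python. This is a simplified illustration, not the authors' implementation: `arcface_loss` is my own name, the defaults `s=64.0` and `m=0.5` are illustrative, the weights are made up, and the backprop part of step 5 is left to whatever framework you train with.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def arcface_loss(feature, weight_rows, label, s=64.0, m=0.5):
    x = l2_normalize(feature)
    # Step 1: cosine similarity between the feature and every class centre.
    cosines = [sum(a * b for a, b in zip(x, l2_normalize(w))) for w in weight_rows]
    # Step 2: recover the true-class angle (the clamp guards against rounding).
    theta = math.acos(max(-1.0, min(1.0, cosines[label])))
    # Step 3: add the angular margin to the true-class angle only.
    logits = list(cosines)
    logits[label] = math.cos(theta + m)
    # Step 4: multiply by the scale factor, then softmax (as a log-sum-exp).
    logits = [s * c for c in logits]
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    # Step 5: cross entropy, i.e. minus the log-probability of the true class.
    return log_sum - logits[label]

W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]   # hypothetical class centres
feature = [2.0, 1.0]
no_margin = arcface_loss(feature, W, label=0, m=0.0)
with_margin = arcface_loss(feature, W, label=0, m=0.5)
print(no_margin, with_margin)
```

With the margin switched on, the same sample gets a larger loss, which is the extra pressure that pulls features toward their class centre. One thing this sketch ignores is the case where $\theta_{y_i}+m$ exceeds $\pi$, where the cosine stops decreasing.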

Putting this together gives the Loss Function below!
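Written out in LaTeX, and assuming the same notation as the rest of this post ($N$ samples, $n$ classes, scale factor $s$, angular margin $m$, and $\theta_j$ the angle between $W_j$ and $x_i$), the ArcFace loss takes this form:

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}
$$

Only the true-class term carries the margin $m$; every other class keeps the plain $s\cos\theta_j$.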

😆 The paper also says that, compared with plain softmax, ArcFace produces a clearly visible gap between neighbouring classes.

That more or less covers the basics of ArcFace. Phew! 😋

Wrapping up

ArcFace, the subject of today's post, is an algorithm I ran into while browsing kaggle. It turns out to be really well known and already widely used :)

I found it in an expert's notebook while looking at the Happy Whale competition 👍

https://www.kaggle.com/competitions/happy-whale-and-dolphin

As of today, though, there are only 12 days left until the final submission. Is this even doable...? 🤣🤣🤣🤣
I'll put in a last-minute sprint and aim to get a submission in.

Next time, it might be nice to share some simple code that uses an existing implementation. 😁
Well then, have a nice day!
