๐Ÿฅ๋ถ„๋ฅ˜ ๋ชจ๋ธ ์„ฑ๋Šฅ ์ง€ํ‘œ (Accuracy, ROC, AUC, F1-score)

장채민 · July 31, 2025

✅ Accuracy

Accuracy is the fraction of all predictions that the model got right.

  • Definition:
    Accuracy = (TP + TN) / (TP + TN + FP + FN)

📚 Related terminology

                      Actual positive   Actual negative
Predicted positive    TP                FP
Predicted negative    FN                TN

  • TN (True Negative): actually negative, and predicted negative
  • FP (False Positive): actually negative, but predicted positive
  • FN (False Negative): actually positive, but predicted negative
  • TP (True Positive): actually positive, and predicted positive

👉 True if the prediction was right, False if it was wrong / Positive if the model predicted positive, Negative if it predicted negative
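As a quick check on the definitions above, here is a minimal sketch that counts the four cells and plugs them into the accuracy formula. The function names and the label lists are made up for illustration, and labels are assumed to be 1 for positive and 0 for negative.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)              # 3 1 1 3
print(accuracy(tp, fp, fn, tn))    # 0.75
```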


  • Sensitivity (TPR, Recall): the fraction of actual positives that are predicted positive, TP / (TP + FN)
  • Specificity (TNR, True Negative Rate): the fraction of actual negatives that are predicted negative, TN / (TN + FP)
  • ROC Curve: a plot of FPR vs. TPR
  • FPR (False Positive Rate) = FP / (FP + TN) = 1 - Specificity
    -> the fraction of actual negatives that are predicted positive

  • AUC: the area under the ROC curve (an overall summary of performance)
  • F1-Score = 2 × (Precision × Recall) / (Precision + Recall)
    -> Precision = TP / (TP + FP), the fraction of predicted positives that are actually positive
    -> the harmonic mean of Precision and Recall
    -> if either Precision or Recall is low, F1 is low as well
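The formulas in the list above can be sketched in a few lines. The function name `rates` and the counts passed to it are hypothetical, and the sketch assumes every denominator is non-zero (real code would need to guard against that).

```python
def rates(tp, fp, fn, tn):
    """Derive the rates from confusion-matrix counts (no zero-division guard)."""
    recall = tp / (tp + fn)        # sensitivity / TPR
    specificity = tn / (tn + fp)   # TNR
    fpr = fp / (fp + tn)           # equals 1 - specificity
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "fpr": fpr,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: high precision (0.8) but low recall (0.4).
m = rates(tp=8, fp=2, fn=12, tn=78)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts F1 comes out near 0.53, which is closer to the recall of 0.4 than the arithmetic mean of 0.6 would be. That is the harmonic mean pulling the score toward the weaker of the two.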



📈 What is the ROC curve?

Image source: https://bioinformaticsandme.tistory.com/328

  • Horizontal axis (x-axis): False Positive Rate (FPR)
  • Vertical axis (y-axis): True Positive Rate (TPR)
  • The curve is drawn by varying the model's threshold and plotting the (FPR, TPR) pair obtained at each value
  • The better the model, the closer its ROC curve gets to the top-left corner
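To make the threshold sweep concrete, here is a minimal sketch that builds the ROC points by hand. It assumes a sample is predicted positive when its score is greater than or equal to the threshold, and it uses every distinct score as a threshold. The function name and the data are made up for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Start with a threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up labels and predicted scores, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

pts = roc_points(y_true, scores)
for fpr, tpr in pts:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold can only add predicted positives, so the points move from (0, 0) toward (1, 1). Each step goes up when the newly included sample is an actual positive and right when it is an actual negative.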

🌷 What is AUC?

Image source: https://angeloyeo.github.io/2020/08/05/ROC.html

AUC (Area Under the ROC Curve) is the area under the ROC curve. It ranges from 0 to 1, and higher is better.
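One common way to get that area from a finite set of ROC points is the trapezoidal rule. The sketch below assumes the points are already sorted from (0, 0) to (1, 1). The function name and the example points are hypothetical.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve through (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid
    return area


# Hypothetical ROC points as (FPR, TPR) pairs.
roc = [(0.0, 0.0), (0.0, 2 / 3), (1 / 3, 2 / 3), (1 / 3, 1.0), (1.0, 1.0)]
print(round(auc_trapezoid(roc), 3))                                  # 0.889

# Two reference cases: the diagonal and a perfect classifier.
print(auc_trapezoid([(0.0, 0.0), (1.0, 1.0)]))                       # 0.5
print(auc_trapezoid([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]))           # 1.0
```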


🧠 Interpreting AUC

AUC value     Meaning
1.0           Perfect classifier (every sample is classified correctly)
0.9 to 1.0    Very good classifier
0.7 to 0.9    Decent classifier
0.5           Same as random guessing
< 0.5         Predictions work in reverse (bad classifier)

For example, AUC = 0.85 means:

If you pick one positive sample and one negative sample at random, there is an 85% chance that the model gives the positive sample a higher "positive" score than the negative one.
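That probabilistic reading can be checked directly by comparing every positive/negative pair. This is a minimal sketch with made-up data and a hypothetical function name. Ties are counted as half a win, which is a common convention and an assumption here.

```python
def auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted scores, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc_pairwise(y_true, scores), 3))  # 0.889
```

Here 8 of the 9 positive/negative pairs are ordered correctly. The only miss is the positive scored 0.6 against the negative scored 0.7.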


✅ Advantages of AUC

  • Relatively robust to class imbalance, since TPR and FPR are each computed within a single actual class
  • Evaluates classification performance across many thresholds at once rather than at one fixed threshold
  • Shows the model's overall ability to separate the classes, which is hard to see from Precision, Recall, or Accuracy alone
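A small, deliberately extreme example of the imbalance point: a "model" that gives every sample the same score. The data, the 0.5 threshold, and the function names are all assumptions for illustration, and AUC is computed with the pairwise definition, counting ties as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Imbalanced made-up data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
scores = [0.1] * 100                             # same score for every sample
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # everything predicted negative

print(accuracy(y_true, y_pred))      # 0.95
print(auc_pairwise(y_true, scores))  # 0.5
```

Accuracy looks high only because negatives dominate, while AUC stays at the random-guessing level because the scores cannot rank positives above negatives.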
