⑫ 🤖 Machine Learning Day 3 - Classification Evaluation

JItzel · December 13, 2025

🏡 Machine_learning


Classification Performance Metrics (Confusion Matrix, F1-Score)

When evaluating a classification model, can 99% accuracy really be called the model's true performance?
When the data is imbalanced, as in cancer diagnosis or defect detection, you must check precision, recall, and F1-Score in addition to accuracy.

1. Confusion Matrix

  • A table showing how well the model's predictions match the actual labels

| Category | Predicted: Positive (1) | Predicted: Negative (0) |
|---|---|---|
| Actual: Positive (1) | TP (True Positive): correct detection (a 1 correctly predicted as 1) | FN (False Negative): miss (actually 1, but predicted 0) |
| Actual: Negative (0) | FP (False Positive): false alarm (actually 0, but wrongly predicted 1) | TN (True Negative): correct rejection (a 0 correctly predicted as 0) |

The four basic elements
First letter (T/F): Was the prediction correct? (True/False)
Second letter (P/N): What did the model predict? (Positive/Negative)
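
To make the four cells concrete, here is a minimal sketch that counts them from two lists of binary labels. The helper name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1   # positive correctly detected
        elif actual == 1 and predicted == 0:
            fn += 1   # positive missed
        elif actual == 0 and predicted == 1:
            fp += 1   # false alarm
        else:
            tn += 1   # negative correctly rejected
    return tp, fn, fp, tn


# Hypothetical labels, made up for illustration
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")  # TP=3, FN=1, FP=1, TN=3
```

The first letter of each name answers "was the prediction right?" and the second answers "what was predicted?", which is exactly how the branches are laid out.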

2. The Three Main Evaluation Metrics

1) Accuracy

  • Formula: $\frac{TP + TN}{Total}$, where $Total = TP + TN + FP + FN$
  • Meaning: the fraction of all data points that were predicted correctly.
  • Limitation: misleading on imbalanced data
    e.g., when only 1 of 100 people is a cancer patient, a model that always answers "normal" still reaches 99% accuracy.
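
The 99% figure can be checked with a few lines of plain Python. This is a sketch of the scenario described above, with the label lists constructed by hand: 100 people, one cancer patient, and a "model" that always answers "normal".

```python
# 100 people, exactly one of whom has cancer (1 = cancer, 0 = normal)
y_true = [1] + [0] * 99
# A "model" that always answers "normal"
y_pred = [0] * 100

# Accuracy: share of all predictions that match the true label
correct = sum(1 for a, p in zip(y_true, y_pred) if a == p)
accuracy = correct / len(y_true)

# Share of the actual cancer patients that the model found
actual_positives = sum(y_true)
found_positives = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
found_fraction = found_positives / actual_positives

print(f"accuracy       = {accuracy:.2f}")        # 0.99
print(f"patients found = {found_fraction:.2f}")  # 0.00
```

The accuracy looks excellent even though the one patient who matters is never detected.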

2) Precision

  • Formula: $\frac{TP}{TP + FP}$
  • Meaning: of everything the model predicted as positive (1), the fraction that is actually positive
  • When it matters: when FP (false alarms) must be reduced
    e.g., spam filtering (classifying a legitimate email as spam is a serious problem)

3) Recall

  • Formula: $\frac{TP}{TP + FN}$
  • Meaning: of the data that is actually positive (1), the fraction the model found without missing it. (Also called sensitivity.)
  • When it matters: when FN (misses) must be reduced
    e.g., cancer diagnosis (diagnosing a cancer patient as normal puts a life at risk)
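
Reducing FP and reducing FN usually pull in opposite directions. The sketch below assumes a model that outputs a score per sample and predicts positive when the score reaches a threshold. The scores, labels, and the helper `precision_recall_at` are all made up for illustration.

```python
# Hypothetical predicted scores and true labels, made up for illustration
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]


def precision_recall_at(threshold):
    """Predict positive when score >= threshold, then compute both metrics."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    # With these thresholds at least one sample is predicted positive,
    # so neither denominator is zero
    return tp / (tp + fp), tp / (tp + fn)


for threshold in (0.85, 0.50, 0.25):
    precision, recall = precision_recall_at(threshold)
    print(f"threshold={threshold:.2f}  precision={precision:.2f}  recall={recall:.2f}")
```

With this data a strict threshold gives perfect precision but finds only half of the positives, while a lenient one finds every positive at the cost of more false alarms.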

3. F1-Score (The Magic of the Harmonic Mean)

  • A metric often preferred when the data is imbalanced. It takes a high value only when precision and recall are balanced rather than skewed toward one side.

Why the harmonic mean and not the arithmetic mean?
$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$
The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals.
It is used to average quantities such as ratios and speeds, whose values can differ widely.
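
As a worked step, the F1 formula above follows from the definition of the harmonic mean. Writing $P$ for precision and $R$ for recall (shorthand used only here):

$$
F1 = \left( \frac{1}{2} \left( \frac{1}{P} + \frac{1}{R} \right) \right)^{-1} = \frac{2}{\frac{1}{P} + \frac{1}{R}} = \frac{2PR}{P + R}
$$

Substituting $P = \frac{TP}{TP + FP}$ and $R = \frac{TP}{TP + FN}$ gives an equivalent form in terms of the confusion-matrix counts:

$$
F1 = \frac{2\,TP}{2\,TP + FP + FN}
$$

TN does not appear anywhere in this expression, so F1 ignores how many negatives were correctly rejected.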

  • Example (average speed) 🚗
    What if you travel at 100 km/h on the way out and 0 km/h on the way back?
    Arithmetic mean: 50 km/h (the midpoint) → but in reality you never make it back, so the speed should be 0.
    Harmonic mean: 0 km/h (taken as the limit, since 1/0 is undefined) → the smaller value carries more weight, so the penalty is large.

  • Property of the F1-Score: if either precision or recall is close to 0, the score drops sharply.
    Both metrics therefore have to be kept high to earn a high score.
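
The penalty is easy to see numerically. This sketch compares the two means on (precision, recall) pairs that all share the same arithmetic mean. The pairs and the helper names are made up for illustration, and returning 0 when both inputs are 0 is a convention chosen here.

```python
def arithmetic_mean(p, r):
    return (p + r) / 2


def f1_score_from(p, r):
    """Harmonic mean of precision p and recall r; taken as 0 when both are 0."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Hypothetical (precision, recall) pairs, all with arithmetic mean 0.5
pairs = [(0.5, 0.5), (0.9, 0.1), (1.0, 0.0)]
for p, r in pairs:
    print(f"P={p:.1f} R={r:.1f}  "
          f"arithmetic={arithmetic_mean(p, r):.2f}  F1={f1_score_from(p, r):.2f}")
```

The arithmetic mean stays at 0.50 for every pair, while F1 falls from 0.50 to 0.18 to 0.00 as the two values drift apart.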

Understanding the Principle

# Hypothetical confusion-matrix counts
TP = 50   # cancer patients correctly predicted as cancer
TN = 40   # normal cases correctly predicted as normal
FP = 10   # normal cases wrongly predicted as cancer (false alarm)
FN = 5    # cancer patients predicted as normal (miss, dangerous!)

total = TP + TN + FP + FN

# 1. Accuracy
accuracy = (TP + TN) / total

# 2. Precision: the denominator is everything predicted positive
precision = TP / (TP + FP)

# 3. Recall: the denominator is everything actually positive
recall = TP / (TP + FN)

# 4. F1-score (harmonic mean of precision and recall)
f1 = 2 * (precision * recall) / (precision + recall)

print(f"Accuracy:  {accuracy:.3f}")   # 0.857
print(f"Precision: {precision:.3f}")  # 0.833
print(f"Recall:    {recall:.3f}")     # 0.909
print(f"F1-score:  {f1:.3f}")         # 0.870
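
The direct divisions above fail when a denominator is zero, for example when a model never predicts positive (TP + FP = 0). The following sketch wraps the same formulas in a function with a guard. The names `safe_divide` and `classification_metrics`, and the choice to return 0.0 for an undefined ratio, are assumptions made for illustration.

```python
def safe_divide(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = safe_divide(tp, tp + fp)
    recall = safe_divide(tp, tp + fn)
    return {
        "accuracy": safe_divide(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_divide(2 * precision * recall, precision + recall),
    }


# Same counts as the example above
print(classification_metrics(tp=50, tn=40, fp=10, fn=5))

# A model that never predicts positive: TP + FP = 0, so precision falls back to 0.0
print(classification_metrics(tp=0, tn=99, fp=0, fn=1))
```

In the second call accuracy is 0.99 while precision, recall, and F1 are all 0.0, which matches the "always normal" scenario from the accuracy section.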

Using Scikit-learn (Pima Indians Diabetes)

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd

# Data preparation (assumed to have been done already)
# x_train, x_test, y_train, y_test = train_test_split(...)
# model.fit(x_train, y_train)

# Run predictions
# Assumes `model` is a classifier already fitted on the training split
pred = model.predict(x_test)  # predict on the test data

# Print the evaluation metrics
print('Accuracy  : ', accuracy_score(y_test, pred))
print('Precision : ', precision_score(y_test, pred))
print('Recall    : ', recall_score(y_test, pred))
print('F1 Score  : ', f1_score(y_test, pred))

# Example output
# Accuracy  :  0.78125
# Precision :  0.7358
# Recall    :  0.5820  <-- recall is relatively low (many actual patients are missed)
# F1 Score  :  0.6500  <-- the low recall pulls the F1 score down as well
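
Scikit-learn can also return the confusion matrix itself. The sketch below uses small hand-made label lists instead of the diabetes data, so the numbers are illustrative only. Note that for 0/1 labels scikit-learn orders the classes as 0 then 1, so the negative row and column come first, unlike the table in section 1.

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical labels, made up for illustration (1 = positive, 0 = negative)
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

# Rows are actual classes and columns are predicted classes, in sorted
# label order (0, 1), so ravel() yields TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FN={fn}, FP={fp}, TN={tn}")  # TP=2, FN=2, FP=1, TN=5

# Per-class precision, recall and F1 in one text table
print(classification_report(y_true, y_pred, digits=3))
```

Unpacking the four counts this way makes it easy to cross-check the library's precision and recall against the hand-written formulas.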

Summary

  • Confusion Matrix: breaks the model's predictions down into correct detections, false alarms, and misses for analysis.
  • Do not trust accuracy alone. (It is especially risky on imbalanced data.)
  • Precision is the hit rate of what the model flagged as positive; recall is the fraction of actual positives that it found.
  • F1-Score is the harmonic mean of precision and recall, which makes it a key measure of how well the two are balanced.