Model Evaluation & Math Basics

InSung-Na · March 18, 2023

Part 09. Machine Learning

This post was written with reference to the Zerobase Data School learning materials.

Classification Model Evaluation

Model Building Process

Evaluation Metrics for Classification Models

Evaluating Binary Classification Models

Accuracy

Precision

Recall (TPR)

Fall-Out (FPR)

Summary
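As a minimal sketch of the four metrics above, assuming the usual confusion-matrix counts (TP, FP, FN, TN), each one is a simple ratio. The function name `binary_metrics`, the helper `_ratio`, and the counts are hypothetical and chosen only for illustration.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall (TPR), and fall-out (FPR) from confusion-matrix counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),  # correct predictions / all predictions
        "precision": _ratio(tp, tp + fp),                # true positives / predicted positives
        "recall": _ratio(tp, tp + fn),                   # true positives / actual positives
        "fall_out": _ratio(fp, fp + tn),                 # false positives / actual negatives
    }


# Hypothetical counts, not taken from the wine data
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall and fall-out share the same structure: both divide by the size of one actual class, which is why they can be plotted against each other as a ROC curve.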


F1-Score
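The F1 score is the harmonic mean of precision and recall. The sketch below uses a hypothetical helper name, `f1_from_precision_recall`, and plugs in the precision and recall values reported later in this post for the wine decision tree, so the result can be compared with the F1 score printed there.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as printed in the wine evaluation output of this post
precision = 0.8026666666666666
recall = 0.7314702308626975
f1 = f1_from_precision_recall(precision, recall)
print("F1 Score : ", f1)
```

Because it is a harmonic mean, the F1 score is pulled toward the smaller of the two values, so a model cannot score well by maximizing only one of them.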

ROC and AUC

  • ROC (Receiver Operating Characteristic) curve: the curve of TPR plotted against FPR as the classification threshold changes

  • AUC (Area Under the ROC Curve): the area under the ROC curve
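The two definitions above can be sketched in plain Python: sweep a threshold over the predicted scores to collect (FPR, TPR) points, then integrate with the trapezoidal rule. The function names `roc_points` and `auc_trapezoid` and the toy labels and scores are hypothetical, and the sketch assumes both classes appear in `y_true`. In practice `roc_curve` and `roc_auc_score` from scikit-learn do this work.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, predicting 1 whenever score >= threshold."""
    pos = sum(y_true)               # number of actual positives
    neg = len(y_true) - pos         # number of actual negatives
    points = [(0.0, 0.0)]           # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear curve through the points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and predicted scores
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
auc = auc_trapezoid(points)
print(points)
print(auc)
```

An AUC of 0.5 corresponds to the diagonal (random guessing), and 1.0 corresponds to a curve that passes through the top-left corner.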

Evaluating with the Wine Data

Loading the Wine Data
import pandas as pd

red_url = 'https://raw.githubusercontent.com/Pinkwink/ML_tutorial/master/dataset/winequality-red.csv'

white_url = 'https://raw.githubusercontent.com/Pinkwink/ML_tutorial/master/dataset/winequality-white.csv'

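# Both files are semicolon-separated
red_wine = pd.read_csv(red_url, sep=';')
white_wine = pd.read_csv(white_url, sep=';')

# Color flag: 1 for red wine, 0 for white wine
red_wine['color'] = 1.
white_wine['color'] = 0.

# Combine both datasets and derive a binary label: 1 if quality > 5, else 0
wine = pd.concat([red_wine, white_wine])
wine['taste'] = [1. if grade>5 else 0. for grade in wine['quality']]

# Drop 'quality' together with the label, because 'taste' is derived directly from it
X = wine.drop(['taste', 'quality'], axis=1)
y = wine['taste']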

Training and Evaluating a Decision Tree

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hold out 20% of the data for testing; fixed seeds keep the run reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=13)
wine_tree = DecisionTreeClassifier(max_depth=2, random_state=13)
wine_tree.fit(X_train, y_train)

# Hard 0/1 class predictions for both splits
y_pred_tr = wine_tree.predict(X_train)
y_pred_test = wine_tree.predict(X_test)

print('Train Acc : ', accuracy_score(y_train, y_pred_tr))
print('Test Acc : ', accuracy_score(y_test, y_pred_test))
----------------------------------------------------------
Train Acc :  0.7294593034442948
Test Acc :  0.7161538461538461

Evaluating the Classification Model

from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.metrics import f1_score, roc_auc_score, roc_curve

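print('Accuracy : ', accuracy_score(y_test, y_pred_test))
print('Recall : ', recall_score(y_test, y_pred_test))
print('Precision : ', precision_score(y_test, y_pred_test))
# Note: y_pred_test holds hard 0/1 labels, so this AUC reflects a single threshold;
# passing predict_proba scores instead would use the full ROC curve
print('ROC AUC Score : ', roc_auc_score(y_test, y_pred_test))
print('F1 Score : ', f1_score(y_test, y_pred_test))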
---------------------------------------------------------------
Accuracy :  0.7161538461538461
Recall :  0.7314702308626975
Precision :  0.8026666666666666
ROC AUC Score :  0.7105988470875331
F1 Score :  0.7654164017800381

Visualizing the ROC Curve

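import matplotlib.pyplot as plt
# Jupyter notebook magic; remove this line when running as a plain script
%matplotlib inline

# Predicted probability of the positive class (taste = 1)
pred_proba = wine_tree.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, pred_proba)

plt.figure(figsize=(10,8))
# Dashed diagonal: the ROC curve of a random classifier
plt.plot([0,1], [0,1], '--c')
plt.plot(fpr, tpr, 'r')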
plt.title("ROC Curve")
plt.xlabel('FPR')
plt.ylabel("TPR")
plt.grid()
plt.show()


Math Basics: Functions

Polynomial Functions
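A polynomial function is a sum of terms of the form coefficient times a power of x. As a small sketch, the hypothetical helper `poly_eval` below evaluates one with Horner's rule, an evaluation technique chosen here for illustration; the example polynomial is also an assumption.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial whose coefficients run from the highest degree down to the constant."""
    result = 0
    for c in coeffs:
        # Horner's rule: multiply the running value by x, then add the next coefficient
        result = result * x + c
    return result


# y = 2x^2 - 3x + 1, evaluated at a few points
for x in (-1, 0, 1, 3):
    print(x, poly_eval([2, -3, 1], x))
```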

Exponential Functions

Logarithmic Functions
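Exponential and logarithmic functions with the same base are inverses of each other. The short sketch below checks this with Python's standard `math` module; the sample values are arbitrary.

```python
import math

x = 3.0
y = math.exp(x)                  # exponential function: e ** x
x_back = math.log(y)             # natural logarithm undoes exp
print(y, x_back)

print(math.log2(8))              # base-2 logarithm: 2 ** 3 == 8
print(math.log10(1000))          # base-10 logarithm: 10 ** 3 == 1000

# Logarithms turn products into sums: log(a * b) == log(a) + log(b)
a, b = 4.0, 5.0
print(math.log(a * b), math.log(a) + math.log(b))
```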

Sigmoid
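The sigmoid function maps any real number into the interval (0, 1) using 1 / (1 + e^(-z)). Here is a minimal sketch with a hypothetical function name `sigmoid`; the two-branch form is an implementation choice to avoid overflow for large negative inputs, not something taken from the original post.

```python
import math


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + e^(-z)), computed without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)             # z < 0, so exp(z) stays small
    return ez / (1.0 + ez)


for z in (-5, 0, 5):
    print(z, sigmoid(z))
```

The output is 0.5 at z = 0 and is symmetric in the sense that sigmoid(z) + sigmoid(-z) = 1, which is why it is commonly read as a probability in binary classification.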

Boxplot
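A boxplot summarizes a distribution with its quartiles and flags points far from the box as outliers. The sketch below assumes the common 1.5 × IQR fence convention and the "inclusive" quantile method of Python's `statistics` module; the function name `boxplot_stats` and the sample data are hypothetical.

```python
import statistics


def boxplot_stats(data):
    """Return quartiles, IQR, fences, and outliers using the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1                        # height of the box
    lower_fence = q1 - 1.5 * iqr         # points below this are drawn as outliers
    upper_fence = q3 + 1.5 * iqr         # points above this are drawn as outliers
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
for name, value in stats.items():
    print(name, value)
```

Different tools compute quartiles slightly differently, so the exact box edges may not match a plotting library's output on small samples.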
