Self-Study ML+DL #6

myeong · September 20, 2022

Self-Study Machine Learning + Deep Learning (혼자 공부하는 머신러닝+딥러닝)

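📌 Regression

k-nearest neighbors classification vs. k-nearest neighbors regression
Classification predicts by majority vote among the k nearest neighbors; regression predicts the mean of those neighbors' target values.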
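
The difference between the two prediction rules can be sketched with NumPy. This is a minimal illustration, not how scikit-learn implements it: the helper `k_nearest_indices` and the toy arrays are hypothetical, and the distance is a plain absolute difference on a single feature.

```python
from collections import Counter

import numpy as np


def k_nearest_indices(train_x, query, k):
    """Return the indices of the k training points closest to `query` (one feature)."""
    distances = np.abs(train_x - query)
    # A stable sort keeps the original order when two distances tie.
    return np.argsort(distances, kind="stable")[:k]


# Hypothetical toy data: one feature, plus a class label and a numeric target per sample.
train_x = np.array([1.0, 2.0, 3.0, 10.0, 11.0])
train_label = np.array(["small", "small", "big", "big", "big"])
train_value = np.array([10.0, 20.0, 30.0, 100.0, 110.0])

idx = k_nearest_indices(train_x, query=2.5, k=3)

# Classification: the most common label among the k neighbors wins.
predicted_label = Counter(train_label[idx].tolist()).most_common(1)[0][0]

# Regression: the prediction is the mean of the k neighbors' target values.
predicted_value = train_value[idx].mean()

print(predicted_label, predicted_value)
```

The same three neighbors are used in both cases; only the way their targets are combined changes.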

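📍 Data Set

Split the data with scikit-learn's train_test_split().
The stratify option, which keeps class proportions the same in the train and test sets, is not needed for regression because the target is a continuous value rather than a class label.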

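from sklearn.model_selection import train_test_split

# perch_length, perch_weight: 1-D NumPy arrays of the perch data (defined before this snippet)
# random_state is fixed so the split is reproducible
train_input, test_input, train_target, test_target = train_test_split(
    perch_length, perch_weight, random_state=42
)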

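Unlike before, the input and the target are both 1-D NumPy arrays.
scikit-learn expects the input as a 2-D array of shape (samples, features), so the input has to be reshaped; the target can stay 1-D.
-> reshape(-1, 1): a 2-D array with a single column (-1 lets NumPy infer the number of rows)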

train_input = train_input.reshape(-1, 1)
test_input = test_input.reshape(-1, 1)
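
To see what reshape(-1, 1) does to the shape, here is a small sketch. The array `lengths` and its values are made up for illustration and are not the perch data.

```python
import numpy as np

# Hypothetical 1-D array of four length measurements.
lengths = np.array([8.4, 13.7, 15.0, 16.2])
print(lengths.shape)  # (4,)

# -1 means "infer this dimension"; 1 fixes the number of columns to one.
lengths_2d = lengths.reshape(-1, 1)
print(lengths_2d.shape)  # (4, 1)
```

Each original value becomes its own row, which is the (samples, features) layout with one feature.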

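📍 Training the regression model

KNeighborsRegressor()
For a regressor, score() returns the coefficient of determination, R².
R² = 1 - { Σ(target - prediction)² / Σ(target - mean)² }
prediction -> mean 👉 R² -> 0 (bad)
prediction -> target 👉 R² -> 1 (good)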
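
The formula and its two limiting cases can be checked by hand. The following is a minimal sketch of the formula, assuming "mean" refers to the mean of the target values being scored; the function `r_squared` and the toy targets are hypothetical names and data.

```python
import numpy as np


def r_squared(target, prediction):
    """R² = 1 - Σ(target - prediction)² / Σ(target - mean)²."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    ss_res = np.sum((target - prediction) ** 2)     # error of the model
    ss_tot = np.sum((target - target.mean()) ** 2)  # error of always predicting the mean
    return 1.0 - ss_res / ss_tot


# Hypothetical target values.
target = np.array([100.0, 200.0, 300.0, 400.0])

# Predicting every target exactly gives R² = 1.
perfect = r_squared(target, target)

# Predicting the mean for every sample gives R² = 0.
baseline = r_squared(target, np.full_like(target, target.mean()))

print(perfect, baseline)
```

A model that does worse than always predicting the mean would make the ratio larger than 1, so R² can also be negative.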

from sklearn.neighbors import KNeighborsRegressor

knr = KNeighborsRegressor()
knr.fit(train_input, train_target)

# R² on the test set
print(knr.score(test_input, test_target))

0.992809406101064 -> close to 1, so good

Mean absolute error (MAE): the average of |prediction - target|

from sklearn.metrics import mean_absolute_error

test_prediction = knr.predict(test_input)
mae = mean_absolute_error(test_target, test_prediction)
print(mae)
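
What the metric computes can be written out in plain NumPy. This is only a sketch of the definition, with made-up target and prediction values rather than the perch results.

```python
import numpy as np

# Hypothetical targets and predictions (same unit as the target, e.g. grams).
target = np.array([100.0, 200.0, 300.0])
prediction = np.array([110.0, 190.0, 330.0])

# Absolute error per sample: 10, 10, 30.
absolute_errors = np.abs(target - prediction)

# MAE is the mean of those absolute errors: (10 + 10 + 30) / 3.
manual_mae = absolute_errors.mean()
print(manual_mae)
```

Because the errors are not squared, the result is in the same unit as the target, so it reads directly as "off by this much on average".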