Day 95 Begins.... (SVM)

조동현 · November 23, 2022


📊 SVM

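📌 What is SVM?

Definition
- A support vector machine (SVM) is mainly used for classification and regression analysis.
- It uses the kernel trick, which evaluates inner products in a higher-dimensional space without constructing the mapped vectors explicitly.
- Low-dimensional data is mapped into a higher-dimensional space (a nonlinear transformation), and classification or regression is then performed there.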
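
To make the kernel trick concrete, here is a minimal sketch using a degree-2 polynomial kernel. That kernel is an assumption chosen for illustration: scikit-learn's `SVC` defaults to an RBF kernel, whose feature space cannot be written out this way. The helper names `phi`, `dot`, and `poly_kernel` are hypothetical. The point is that the kernel value computed in the original 2-D space matches the inner product after an explicit mapping to 3-D.

```python
import math

def phi(x):
    """Explicit feature map from 2-D to 3-D for the kernel k(x, z) = (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed entirely in the original 2-D space."""
    return dot(x, z) ** 2

x = (1.0, 2.0)
z = (3.0, 0.5)

explicit = dot(phi(x), phi(z))   # inner product in the 3-D feature space
implicit = poly_kernel(x, z)     # same quantity without ever building phi
print(explicit, implicit)        # the two agree up to floating-point rounding
```

The SVM only ever needs such inner products, so it can work in the higher-dimensional space while paying the cost of the low-dimensional computation.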

📊 SVM Practice

📌 SVM Problem 1

1. Import libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn import svm
from sklearn.metrics import accuracy_score

2. Prepare the data

data = pd.read_csv('bmi.csv')
data = data[:50000]
print(data.head(2))
#    height  weight   label
# 0     180      69  normal
# 1     192      79  normal

3. Separate features and label

feature = data.drop(['label'], axis=1)
label = data['label']

4. Feature scaling

# Divide by fixed constants so both columns fall roughly in the 0-1 range
feature['weight'] = feature['weight'] / 100
feature['height'] = feature['height'] / 200
print(feature.head(2))
#    height  weight
# 0    0.90    0.69
# 1    0.96    0.79

5. Label encoding

label = label.map({'thin':0, 'normal':1, 'fat':2})
print(label.unique())
# [1 2 0]

6. Split into training and test data

x_train, x_test, y_train, y_test = train_test_split(feature, label, test_size=0.3, random_state=10)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)
# (35000, 2) (15000, 2) (35000,) (15000,)

7. Create the SVM model

model = svm.SVC(C=0.01)     # C must be positive; a smaller C means stronger L2 regularization
model.fit(x_train, y_train)
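
As a rough illustration of what `C` controls, the sketch below fits `SVC` with several `C` values and counts the support vectors each model keeps. The data comes from `make_blobs`, which is an assumption standing in for bmi.csv (not reproduced here), so the numbers say nothing about the BMI data itself. A smaller `C` penalizes margin violations less, so more training points typically end up as support vectors.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic 3-class, 2-feature data (a stand-in for the BMI data)
X, y = make_blobs(n_samples=600, centers=3, cluster_std=2.0, random_state=10)

n_support = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(C=C)                 # default RBF kernel, as in step 7
    clf.fit(X, y)
    # n_support_ holds the number of support vectors per class
    n_support[C] = int(clf.n_support_.sum())
    print(f"C={C}: {n_support[C]} support vectors, train accuracy={clf.score(X, y):.3f}")
```

Trying a few values like this, ideally scored with cross-validation rather than training accuracy, is one way to choose `C`.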

8. Compare predicted and actual values

y_pred = model.predict(x_test)
print('Predicted : ', y_pred[:10])
print('Actual    : ', y_test[:10].values)   # .values drops the index so both print as arrays

9. Evaluate model performance

acc = accuracy_score(y_test, y_pred)
print('Model accuracy : ', acc)
# Model accuracy :  0.9686666666666667
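
Accuracy is a single number, so it can hide which classes get confused with each other. A minimal sketch of a per-class view using scikit-learn's `confusion_matrix` and `classification_report`, again on synthetic `make_blobs` data (an assumption, not bmi.csv):

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 3-class data standing in for the BMI data
X, y = make_blobs(n_samples=600, centers=3, cluster_std=2.0, random_state=10)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10)

clf = SVC(C=0.01).fit(x_train, y_train)
y_pred = clf.predict(x_test)

cm = confusion_matrix(y_test, y_pred)      # rows: actual class, columns: predicted class
acc = accuracy_score(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, zero_division=0))
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total equals the accuracy.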

10. Cross-validation

cross_vali = cross_val_score(model, feature, label, cv=3)
print('Accuracy per fold : ', cross_vali)
print('Mean accuracy : ', np.mean(cross_vali))
# Accuracy per fold :  [0.96934061 0.96364073 0.96693868]
# Mean accuracy :  0.9666400059734315
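
For a classifier and an integer `cv`, `cross_val_score` splits the data with stratified k-fold and fits a fresh copy of the model on each split. The sketch below spells out that loop and compares it with the one-line call. It runs on synthetic `make_blobs` data, which is an assumption since bmi.csv is not available here.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_blobs
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic 3-class data standing in for the BMI data
X, y = make_blobs(n_samples=600, centers=3, cluster_std=2.0, random_state=10)
model = SVC(C=0.01)

manual_scores = []
for train_idx, val_idx in StratifiedKFold(n_splits=3).split(X, y):
    fold_model = clone(model)                      # unfitted copy with the same parameters
    fold_model.fit(X[train_idx], y[train_idx])     # train on two folds
    manual_scores.append(fold_model.score(X[val_idx], y[val_idx]))  # accuracy on the held-out fold

auto_scores = cross_val_score(model, X, y, cv=3)
print(np.array(manual_scores))
print(auto_scores)
```

Note that `cross_val_score` leaves `model` itself unchanged, so a model fitted earlier keeps its fitted state.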

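11. Visualization

# Reload with the label column (index 2) as the index so rows can be selected by class name
label_data = pd.read_csv('bmi.csv', index_col=2)

def scatter_func(label, color):
    # Plot every row of one class in a single color
    b = label_data.loc[label]
    plt.scatter(b['weight'], b['height'], c=color, label=label)

scatter_func('fat', 'red')
scatter_func('normal', 'green')
scatter_func('thin', 'blue')
plt.xlabel('weight')
plt.ylabel('height')
plt.legend()
plt.show()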

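12. Prediction

# Column order must match the training features (height, weight);
# recent scikit-learn versions reject a DataFrame whose columns are in a different order
new_data = pd.DataFrame({'height':[180, 170], 'weight':[60, 55]})
# Apply the same scaling as in step 4
new_data['height'] = new_data['height'] / 200
new_data['weight'] = new_data['weight'] / 100
new_pred = model.predict(new_data)
print(new_pred)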
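
`model.predict` returns the integer codes from step 5. A small sketch for turning them back into class names by inverting that mapping; the `new_pred` values below are made up for illustration and are not actual model output.

```python
label_map = {'thin': 0, 'normal': 1, 'fat': 2}          # same mapping as in step 5
inverse_map = {code: name for name, code in label_map.items()}

new_pred = [0, 1]                                        # hypothetical predicted codes
# int() also accepts NumPy integers, so this works on a predict() result as well
names = [inverse_map[int(code)] for code in new_pred]
print(names)
# ['thin', 'normal']
```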
