- GridSearchCV
Exhaustively evaluates every combination of the specified hyperparameters with cross-validation.
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
params = {'max_depth':[None, 1, 2, 3, 4, 5, 6, 7],
'max_leaf_nodes':[3,4,5,6,7,8,9]
}
gs = GridSearchCV(DecisionTreeClassifier(random_state=0),   # estimator to tune
                  param_grid=params,
                  scoring=['accuracy', 'roc_auc', 'average_precision'],  # multi-metric evaluation
                  refit='roc_auc',   # refit the best model (and define best_*) by ROC AUC
                  cv=4,
                  n_jobs=-1
                  )
gs.fit(x_train, y_train)
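The snippets above and below assume that x_train, y_train, x_test, y_test already exist. A minimal sketch of such a setup, assuming the scikit-learn breast-cancer dataset (the dataset choice and split parameters are illustrative assumptions, not part of the original notes):
# Assumed setup: load a binary-classification dataset and split it.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)   # any binary-classification dataset works here
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)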
- RandomizedSearchCV
Evaluates only a randomly sampled subset (n_iter combinations) of the specified hyperparameters with cross-validation.
from sklearn.model_selection import RandomizedSearchCV
params = {
'max_depth':range(1, 11),
'max_leaf_nodes': range(3, 31, 3),
'min_samples_leaf':[10,30,50,70,90]
}
tree = DecisionTreeClassifier(random_state=0)
rs = RandomizedSearchCV(tree,
                        param_distributions=params,
                        scoring='accuracy',
                        cv=4,
                        n_jobs=-1,
                        n_iter=60   # number of combinations sampled from the grid
                        )
rs.fit(x_train, y_train)
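param_distributions is not limited to fixed lists or ranges: any object with an rvs method, such as the distributions in scipy.stats, can be sampled from. A short sketch (the specific ranges are illustrative assumptions):
# Sampling hyperparameters from distributions instead of a fixed grid.
from scipy.stats import randint

rand_params = {
    'max_depth': randint(1, 11),           # integers sampled uniformly from [1, 10]
    'min_samples_leaf': randint(10, 100),  # integers sampled uniformly from [10, 99]
}
rs2 = RandomizedSearchCV(DecisionTreeClassifier(random_state=0),
                         param_distributions=rand_params,
                         scoring='accuracy',
                         cv=4,
                         n_jobs=-1,
                         n_iter=60,
                         random_state=0)   # makes the sampling reproducible
rs2.fit(x_train, y_train)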
- Checking the results
import pandas as pd
print('Best combination score (refit metric, roc_auc):', gs.best_score_)
print('Best combination:', gs.best_params_)
print('Per-combination results as a DataFrame:')
print(pd.DataFrame(gs.cv_results_).sort_values('rank_test_accuracy'))
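Because three scoring metrics were passed, cv_results_ contains mean_test_accuracy, mean_test_roc_auc and mean_test_average_precision (plus the matching rank_test_* columns). A sketch of narrowing the DataFrame down to the columns that usually matter:
# Keep only the parameter settings and the per-metric mean CV scores.
result_df = pd.DataFrame(gs.cv_results_)
cols = ['params', 'mean_test_accuracy', 'mean_test_roc_auc',
        'mean_test_average_precision', 'rank_test_roc_auc']
print(result_df[cols].sort_values('rank_test_roc_auc').head())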
- Test set evaluation
from sklearn.metrics import roc_auc_score
# ROC AUC needs probability scores for the positive class, not hard class labels.
roc_auc_score(y_test, gs.best_estimator_.predict_proba(x_test)[:, 1])
roc_auc_score(y_test, gs.predict_proba(x_test)[:, 1])   # gs delegates to best_estimator_ after refit
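Since accuracy and average precision were also tracked during the search, they can be checked on the test set as well; a short sketch under the same assumed data split:
# Hard-label and probability-based metrics on the held-out test set.
from sklearn.metrics import accuracy_score, average_precision_score

pred = gs.predict(x_test)               # class labels from the refit best estimator
proba = gs.predict_proba(x_test)[:, 1]  # positive-class probabilities

print('accuracy         :', accuracy_score(y_test, pred))
print('average precision:', average_precision_score(y_test, proba))
print('roc auc          :', roc_auc_score(y_test, proba))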