🎄 Productive-Life Diary, Entry 2 ~ 🎄
Today I figured I'd try wiring a Pipeline and GridSearchCV together on the iris dataset, so I wrote the code.


  • Check a data sample
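import pandas as pd
from sklearn.datasets import load_iris

# Load iris and wrap the features in a DataFrame with readable column names
iris = load_iris()
X = pd.DataFrame(
    iris.data,
    columns=iris.feature_names
)
y = iris.target
target_names = iris.target_names
print(X.head())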

  • Split the data
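from sklearn.model_selection import train_test_split

# Stratified 80/20 split keeps the class proportions equal in train and test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(f"Train: {X_train.shape}")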

+ Output


  • Pipeline
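from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline

# Single-step pipeline; the step name 'clf' is what the grid keys refer to
pipe_li = Pipeline([
    ('clf', RandomForestClassifier(random_state=42))
])
print("Pipeline setup complete")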

+ Output


  • Hyperparameter grid
# 4 * 3 * 2 * 2 = 48 combinations; keys follow the <step name>__<parameter> rule
param_grid = {
    'clf__n_estimators': [10, 50, 100, 200],
    'clf__max_depth': [None, 10, 20],
    'clf__min_samples_split': [2, 5],
    'clf__min_samples_leaf': [1, 2]
}
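
To get a feel for how much work this grid means, the combinations can be counted with scikit-learn's `ParameterGrid`. This is only a sketch; `n_folds = 5` is an assumption chosen to match a 5-fold cross-validation.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "clf__n_estimators": [10, 50, 100, 200],
    "clf__max_depth": [None, 10, 20],
    "clf__min_samples_split": [2, 5],
    "clf__min_samples_leaf": [1, 2],
}

# ParameterGrid expands the dict into every combination: 4 * 3 * 2 * 2
n_candidates = len(ParameterGrid(param_grid))
n_folds = 5  # assumption: 5-fold cross-validation
print(n_candidates, n_candidates * n_folds)  # prints: 48 240
```

So a full search fits 240 models, plus one final refit on the whole training set.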

  • Create GridSearchCV
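from sklearn.model_selection import GridSearchCV, StratifiedKFold

# 5-fold stratified CV over every combination in param_grid, using all CPU cores
grid_clf = GridSearchCV(
    pipe_li,
    param_grid,
    cv=StratifiedKFold(5),
    scoring='accuracy',
    n_jobs=-1,
    verbose=1
)
grid_clf.fit(X_train, y_train)
print('GridSearch complete!')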

+ Output
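
Once a search has been fitted, the winning combination and its cross-validated score can be read back from the search object. The sketch below is self-contained and uses a deliberately tiny grid (`small_grid`, a hypothetical name) so it runs fast; it is not the full grid from this post.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)
pipe = Pipeline([("clf", RandomForestClassifier(random_state=42))])

# Tiny grid chosen only for speed (an assumption, not the grid in the post)
small_grid = {"clf__n_estimators": [10, 50], "clf__max_depth": [None, 3]}

search = GridSearchCV(pipe, small_grid, cv=StratifiedKFold(5), scoring="accuracy")
search.fit(X, y)

print("Best params:", search.best_params_)
print("Best CV accuracy:", round(search.best_score_, 3))
# cv_results_ holds one entry per candidate combination
print("Candidates tried:", len(search.cv_results_["params"]))
```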


  • Predict and check performance
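from sklearn.metrics import accuracy_score, confusion_matrix

# best_estimator_ is the pipeline refit on the full training set with the best parameters
y_pred = grid_clf.best_estimator_.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:")
print(confusion_matrix(y_test, y_pred))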

+ Output
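
Accuracy and the confusion matrix can be complemented with per-class precision and recall. A minimal sketch, assuming a plain `RandomForestClassifier` stands in for the tuned model to keep the example short:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Untuned model used here for brevity (an assumption, not the searched pipeline)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

# target_names replaces the 0/1/2 labels with the species names in the report
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(report)
```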


  • Check the human-readable names
predicted_flowers = iris.target_names[y_pred]
true_flowers = iris.target_names[y_test]

print(predicted_flowers[:5])
print(true_flowers[:5])

+ Output


  • Enter a new flower and predict
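# Build the sample as a DataFrame with the training column names,
# which avoids scikit-learn's "X does not have valid feature names" warning
new_flower = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=X.columns)
pred = grid_clf.best_estimator_.predict(new_flower)
predicted_flowers_name = target_names[pred][0]
print("Predicted flower:", predicted_flowers_name)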

+ Output
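
Besides the single predicted label, a RandomForest can also report how confident it is for each class through `predict_proba`. A small sketch, assuming an untuned model trained on the whole dataset purely for illustration:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)

# Illustration only: untuned model fitted on all rows (an assumption)
model = RandomForestClassifier(random_state=42).fit(X, iris.target)

new_flower = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=iris.feature_names)
proba = model.predict_proba(new_flower)[0]  # one probability per class

for name, p in zip(iris.target_names, proba):
    print(f"{name}: {p:.2f}")
```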



🖍️ Study notes ＿〆(。。) 🖍️

I heard that people rarely use Colab at work and mostly use VS Code, so I installed it. Python was already installed, but it kept reporting that it wasn't working, and searching around didn't fix it. In the end I decided to create a virtual environment and upload my files there.

With help from a classmate I managed to load my code, and I can now jump back and forth between the previous step and the one before it.
After that, everything worked really well.

The coding order I wanted to follow today was:
Iris -> Pipeline -> RandomForest (ensemble) -> GridSearchCV -> evaluation -> new-flower prediction.
The reason: I wanted to put preprocessing and the model together in a Pipeline and tune the whole pipeline as one unit with GridSearchCV.
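
As a rough sketch of that idea, a preprocessing step and a model can sit in one Pipeline and be searched together. The `scaler` step and the tiny `grid` below are assumptions added for illustration; the pipeline in this post contains only the `clf` step.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ("scaler", StandardScaler()),  # hypothetical preprocessing step
    ("clf", RandomForestClassifier(random_state=42)),
])

# Parameters of either step can be searched using step__param names
grid = {
    "scaler__with_mean": [True, False],
    "clf__n_estimators": [10, 50],
}
search = GridSearchCV(pipe, grid, cv=3, scoring="accuracy")
search.fit(X, y)
print(search.best_params_)
```

Because the scaler is inside the pipeline, it is refitted on each training fold only, so nothing from the validation fold leaks into preprocessing.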

👏 What I learned from practicing on my own 👏
1. Scaling with RandomForest
  • Tree-based models do not need feature scaling.
  • Distance-based models (k-NN, for example) effectively require it.
  • Putting a scaler in the Pipeline anyway is about extensibility and consistency.
2. GridSearchCV is "not a pipeline"
  • GridSearchCV is a tuner that wraps a model (or a pipeline).
  • You can pass it a Pipeline or a single model.
3. In a Pipeline, the step name__parameter naming rule is the key.
4. Supervised classification does not invent a new answer; it picks one of the classes it already knows.
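
The points above can be sketched in one small script. It uses `KNeighborsClassifier` as an example of a distance-based model, which is my own choice for illustration and not something used elsewhere in this post.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# 1) Wrapping a single model: parameter names are used directly
single = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [3, 5]}, cv=3)
single.fit(X, y)

# 2) Wrapping a pipeline: names become <step name>__<parameter>
pipe = Pipeline([("scaler", StandardScaler()), ("knn", KNeighborsClassifier())])
piped = GridSearchCV(pipe, {"knn__n_neighbors": [3, 5]}, cv=3)
piped.fit(X, y)

print(single.best_params_, piped.best_params_)

# A classifier only ever answers with one of the labels it was trained on
sample = [[5.1, 3.5, 1.4, 0.2]]
print(piped.classes_, piped.predict(sample))
```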

It's still really hard, but I'll keep working to build up my skills little by little, and I believe that effort will pay off!!! I may be slow, but I can do this!!!!! Let's believe in myselffff

곽숭아_놀이터
