Task Type 2: Predicting Customer Response to a Bank's Telemarketing Campaign

SOOYEON · June 23, 2022

Big Data Analysis Engineer exam (빅데이터분석기사)

Customer response to a bank's telemarketing campaign

data

display(train.head())
display(test.head())
display(submission.head())

train

      ID  age         job  marital  education  default  balance  housing  loan   contact  day  month  campaign  pdays  previous  poutcome   y
0  13829   29  technician   single   tertiary       no    18254       no    no  cellular   11    may         2     -1         0   unknown  no
1  22677   26    services   single  secondary       no      512      yes   yes   unknown    5    jun         3     -1         0   unknown  no
2  10541   30  management   single  secondary       no      135       no    no  cellular   14    aug         2     -1         0   unknown  no
3  13689   41  technician  married    unknown       no       30      yes    no  cellular   10    jul         1     -1         0   unknown  no
4  11304   27      admin.   single  secondary       no      321       no   yes   unknown    2    sep         1     -1         0   unknown  no

test

      ID  age          job   marital  education  default  balance  housing  loan   contact  day  month  campaign  pdays  previous  poutcome
0  53608   32   management    single   tertiary       no    12569       no    no  cellular    1    jul         2    295         2   success
1  51055   25     services    single  secondary       no      801       no    no  cellular    5    jun         2     -1         0   unknown
2  52573   46  blue-collar   married  secondary       no     1728      yes    no   unknown   26    may         2     -1         0   unknown
3  50458   39   management  divorced  secondary       no       51       no    no   unknown   17    jun         2     -1         0   unknown
4  52272   31     services    single   tertiary       no     1626       no    no   unknown   31    jul         1     -1         0   unknown

submission

      ID
0  53608
1  51055
2  52573
3  50458
4  52272

Splitting the data

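import pandas as pd
from sklearn.model_selection import train_test_split

# Features: drop the ID column and the target 'y'
# (equivalent to train.drop(['ID', 'y'], axis=1))
x = train.drop(columns=['ID', 'y'])
xd = pd.get_dummies(x)  # one-hot encode the categorical columns
y = train['y']

# Hold out a validation split; stratify=y keeps the yes/no ratio the same in both parts
X_train, X_test, y_train, y_test = train_test_split(xd, y, stratify=y, random_state=1)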
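
To see what stratify = y does in the split above, here is a minimal sketch on made-up toy data (not the exam dataset). The names toy_X, toy_y, X_tr, X_val, y_tr and y_val are only for illustration. With the default test_size of 0.25, an 80/20 label ratio should be preserved in both parts.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy, made-up data: 80 'no' labels and 20 'yes' labels (not the exam dataset).
toy_X = pd.DataFrame({"feature": range(100)})
toy_y = pd.Series(["no"] * 80 + ["yes"] * 20)

# Default test_size is 0.25, so 75 rows go to training and 25 to validation.
X_tr, X_val, y_tr, y_val = train_test_split(
    toy_X, toy_y, stratify=toy_y, random_state=1
)

# With stratify, both parts keep the original 80/20 class ratio.
print(y_tr.value_counts(normalize=True))
print(y_val.value_counts(normalize=True))
```

This matters for imbalanced targets such as a marketing response, where an unstratified split could leave very few positive rows in the validation part.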

RandomForest

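Use predict_proba(X_test) to get class probabilities rather than hard labels.

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier()
model.fit(X_train, y_train)
pred = model.predict_proba(X_test)  # one column per class, in the order of model.classes_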
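
Since the later steps index the probabilities with [:, 1], it helps to confirm which class that column belongs to. The sketch below uses a tiny made-up dataset; clf, yes_col and p_yes are illustrative names, and n_estimators=20 is an arbitrary choice to keep it fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy, made-up data with string labels like the 'y' column in train.
X = np.array([[0], [1], [2], [3], [10], [11], [12], [13]])
y = np.array(["no"] * 4 + ["yes"] * 4)

clf = RandomForestClassifier(n_estimators=20, random_state=0)
clf.fit(X, y)

# classes_ holds the labels in sorted order; predict_proba columns follow it.
print(clf.classes_)

proba = clf.predict_proba(X)
yes_col = list(clf.classes_).index("yes")  # look the column up instead of assuming 1
p_yes = proba[:, yes_col]
print(p_yes)
```

With 'no' and 'yes' as labels, 'yes' sorts second, so column 1 is the probability of a positive response.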

Evaluation

from sklearn.metrics import roc_auc_score, classification_report

print('test roc score : ',roc_auc_score(y_test,pred[:,1]))

# Same score rounded to three decimals, plus a per-class classification report
print(f'test roc score : {roc_auc_score(y_test, pred[:,1]):.3f}')
print(f'classification report : \n{classification_report(y_test, model.predict(X_test))}')
# result
test roc score :  0.7756576420890937
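
As a rough intuition for the score above: ROC AUC equals the fraction of (positive, negative) pairs in which the positive row gets the higher predicted probability, with ties counted as half. The helper pairwise_auc below is a hypothetical illustration in plain Python, not how scikit-learn computes it internally, and the labels and scores are made up.

```python
def pairwise_auc(labels, scores, positive="yes"):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == positive]
    neg = [s for l, s in zip(labels, scores) if l != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy, made-up labels and predicted probabilities of 'yes'.
toy_labels = ["no", "no", "yes", "yes"]
toy_scores = [0.1, 0.4, 0.35, 0.8]

auc = pairwise_auc(toy_labels, toy_scores)
print(auc)  # 3 of the 4 pairs are ordered correctly -> 0.75
```

Read this way, a score of about 0.776 means the model ranks a responding customer above a non-responding one in roughly 78% of such pairs.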

test

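# Align the test dummies to the training columns so the model sees the same features
test_d = pd.get_dummies(test.drop('ID', axis = 1)).reindex(columns = X_train.columns, fill_value = 0)
pred_test = model.predict_proba(test_d)
submission['predict'] = pred_test[:,1]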
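
One thing to watch in this step: pd.get_dummies applied separately to train and test can yield different columns when a category appears in only one of them, and the model expects exactly the features it was fitted on. A minimal sketch of aligning them with reindex, on made-up frames (toy_train, toy_test, train_d, test_d and aligned are illustrative names):

```python
import pandas as pd

# Toy, made-up frames: 'jul' appears only in test, 'may' and 'sep' only in train.
toy_train = pd.DataFrame({"age": [30, 41, 27], "month": ["may", "jun", "sep"]})
toy_test = pd.DataFrame({"age": [32, 25], "month": ["jul", "jun"]})

train_d = pd.get_dummies(toy_train)
test_d = pd.get_dummies(toy_test)

# Reorder to the training columns: columns missing from test are filled with 0,
# and columns unseen during training (month_jul) are dropped.
aligned = test_d.reindex(columns=train_d.columns, fill_value=0)
print(aligned)
```

If both frames happen to contain every category, the reindex changes nothing, so it is a cheap safeguard.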

submission

display(submission.head())
submission.to_csv('submission_.csv', index=False)

      ID  predict
0  53608     0.73
1  51055     0.82
2  52573     0.00
3  50458     0.14
4  52272     0.32
