Random forest.
This model draws random samples from the data set and builds many decision trees, which is called bagging. Then it combines their predictions, using the mean for regression and the mode (the most frequent value) for classification.
It is useful for predicting target values and for assessing complicated data sets.
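A minimal sketch of the two bagging steps described above: drawing a sample with replacement, then combining the per-tree predictions. This is not a full random forest. The per-tree predictions are hard-coded stand-ins, and the function names are my own, not from any library.

```python
import random
from statistics import mean, mode


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bagging sample)."""
    return [rng.choice(rows) for _ in rows]


def aggregate_regression(predictions):
    """Combine per-tree numeric predictions with the mean."""
    return mean(predictions)


def aggregate_classification(predictions):
    """Combine per-tree class labels with the mode (most frequent label)."""
    return mode(predictions)


rng = random.Random(0)  # fixed seed so the sample is repeatable
data = [1, 2, 3, 4, 5]
sample = bootstrap_sample(data, rng)  # same size as data, duplicates allowed

# Stand-in predictions from three hypothetical trees.
tree_values = [2.0, 3.0, 4.0]
tree_labels = ["cat", "dog", "cat"]

forest_value = aggregate_regression(tree_values)      # 3.0
forest_label = aggregate_classification(tree_labels)  # "cat"
```

In a real forest each tree would be trained on its own bootstrap sample before its prediction is combined with the others.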
Confusion matrix.
This is a table that shows the count of each outcome: TP, TN, FP, and FN.
TP is true positive: the model correctly predicts the target (positive) value.
TN is true negative: the model correctly predicts the non-target (negative) value.
This is the fun part.
FP is false positive: the model predicts positive but the actual target value is negative, that is, 0 or the undesired class.
FN is false negative: the model predicts negative but the actual target value is positive, that is, 1 or the desired class.
Evaluating the model depends on whether its FPs and FNs can be found.
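A small sketch of counting the four cells, assuming labels are coded as 1 (positive) and 0 (negative). The function name and the example lists are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels coded as 1 and 0."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1  # predicted positive, actually positive
        elif p == 0 and a == 0:
            counts["TN"] += 1  # predicted negative, actually negative
        elif p == 1 and a == 0:
            counts["FP"] += 1  # predicted positive, actually negative
        else:
            counts["FN"] += 1  # predicted negative, actually positive
    return counts


actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(actual, predicted)
# counts == {"TP": 2, "TN": 2, "FP": 1, "FN": 1}
```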
Threshold.
This is the criterion for deciding whether a prediction counts as positive or not.
For instance, if you set the threshold to 0.3, the model will predict positive for any score above 0.3.
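A sketch of the threshold step, assuming the model outputs a score between 0 and 1 for each sample. The scores here are made-up numbers, and the function name is my own.

```python
def apply_threshold(scores, threshold=0.5):
    """Label a score as positive (1) when it is above the threshold."""
    return [1 if s > threshold else 0 for s in scores]


scores = [0.1, 0.35, 0.6, 0.29]

labels_03 = apply_threshold(scores, 0.3)  # [0, 1, 1, 0]
labels_05 = apply_threshold(scores, 0.5)  # [0, 0, 1, 0]
```

Lowering the threshold turns more samples positive, which usually trades fewer FN for more FP.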
ROC score.
AUC stands for area under the curve, here the ROC curve.
I think this is very important for tuning the model well, because before the model is set, analysing and separating the data well is important.
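A sketch of one way to compute the AUC without drawing the curve. It uses the fact that the ROC AUC equals the probability that a randomly chosen positive sample gets a higher score than a randomly chosen negative one, with ties counted as half. The function name and the data are assumptions for illustration, and the pairwise loop is only practical for small data sets.

```python
def roc_auc(actual, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


actual = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(actual, scores)  # 0.75
```

An AUC of 1.0 means every positive is scored above every negative, and 0.5 is what random scores would give.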