AdaBoost and GBM

안성인 · March 30, 2022

📖 Ensemble learning comes in several forms, including voting, bagging, boosting, and stacking.
This post covers boosting, and in particular two algorithms from that family: AdaBoost and GBM.
Let's start by walking through how boosting works, step by step.


[Boosting]

  • Boosting is an ensemble technique that chains several weak learners together to build a strong learner.
  • For reasons of speed and performance, decision trees are usually used as the weak learners.
  • In the resampling view described here, boosting, like bagging, builds several models from random samples drawn with replacement. The difference is that bagging trains its models in parallel and then combines them, whereas boosting trains its models sequentially, one after another.
  • A model is first trained on sample 1. Some of the data in sample 1 will be classified poorly. The boosting algorithm gives those data points more weight and passes them on to sample 2. Because the sampling is random with replacement, sample 2 also contains data that was not used in sample 1, and weights are applied to those points as well.
  • In the end, the prediction does not come from the last trained model alone; all models trained so far are taken into account.
  • Weighting the errors in this way helps accuracy considerably, but it can also make the method vulnerable to outliers.
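
The reweight-then-resample step described above can be sketched in a few lines. This is only an illustration in plain Python, not the code used later in this post: the ten toy points, the set of misclassified indices, the factor 2, and the helper name `resample` are all assumptions made for the example.

```python
import random

def resample(points, probs, k, seed=0):
    """Draw k points with replacement, proportionally to probs."""
    rng = random.Random(seed)
    return rng.choices(points, weights=probs, k=k)

points = list(range(10))        # ten toy observations, identified by index
weights = [1.0] * 10            # sample 1: every point is equally likely
misclassified = {2, 7}          # hypothetical errors of the first model

# Give the misclassified points more weight before drawing sample 2.
for i in misclassified:
    weights[i] *= 2.0           # the factor 2 is arbitrary in this sketch

total = sum(weights)
probs = [w / total for w in weights]
sample2 = resample(points, probs, k=10)

print(probs)                    # selection probability of each point
print(sample2)                  # indices drawn for sample 2 (repeats allowed)
```

Points 2 and 7 are now twice as likely to be drawn as any other point, so the next model sees them more often.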

[AdaBoost]

  • Short for Adaptive Boosting, AdaBoost is a representative boosting algorithm: like boosting in general, it assigns weights to the data that the weak learners get wrong as it proceeds.
  • That is, the weights of the data misclassified by the first model are increased, which raises the probability that those points are selected again when the next model is built, so that they are learned more. When later models classify them correctly, their weights are decreased again. The final model is the result of this process.
  • Where plain boosting treats the individual models equally, AdaBoost adds the idea of giving each model its own weight.
  • It is most commonly used to improve the performance of decision trees on binary classification problems.

1. Training procedure

  • The procedure is illustrated with three weak learners.

1) The first weak learner separates + and - using the first decision boundary (D1) and identifies the misclassified data

(the split is chosen to minimize the sum of the weights of the misclassified data)

2) The misclassified data are given more weight (the enlarged + marks)

3) The second weak learner takes the weights into account, separates + and - again using the second decision boundary (D2), and identifies the misclassified data

4) The misclassified data are given more weight (the enlarged - marks)

(weights are increased for misclassified data and decreased otherwise, which updates the data weights for the next round)

5) The third weak learner takes the weights into account, separates + and - again using the third decision boundary (D3), and identifies the misclassified data

6) Finally, each of the three classifiers is assigned its computed model weight, and the classifiers are combined as a weighted sum to produce the classification
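
The six steps above can be sketched as follows. This is a minimal illustration, assuming one-dimensional toy data, decision stumps as weak learners, and the standard discrete AdaBoost model weight alpha = 0.5 * ln((1 - err) / err). It is plain Python rather than the caret code used in the practice section, and the names `best_stump`, `adaboost`, and `predict` are hypothetical.

```python
import math

def best_stump(xs, ys, w):
    """Pick the threshold and sign with the smallest weighted error."""
    best = None
    for t in sorted(set(xs)):
        for sign in (1, -1):
            preds = [sign if x <= t else -sign for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n                            # uniform data weights
    model = []                                   # (alpha, threshold, sign)
    for _ in range(rounds):
        err, t, sign = best_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this learner
        model.append((alpha, t, sign))
        # Raise the weights of misclassified points, lower the others,
        # then renormalize so the weights sum to 1.
        preds = [sign if x <= t else -sign for x in xs]
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(a * (s if x <= t else -s) for a, t, s in model)
    return 1 if score >= 0 else -1

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]          # toy labels: +1 / -1
model = adaboost(xs, ys, rounds=3)
train_acc = sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(xs)
print(model)
print(train_acc)
```

No single stump can separate these labels, but on this toy data the weighted vote of three stumps classifies every training point correctly.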


2. Advantages

  • It can reduce the tendency to overfit.
  • Compared with Random Forest, it tends to give better results on binary classification.

3. Disadvantages

  • It is sensitive to noisy data and outliers.

4. Practice

> library(caret)

> # [List the models whose name contains 'ada']
> grep('ada', names(getModelInfo()), value = T, ignore.case = T)
[1] "ada"         "AdaBag"      "AdaBoost.M1" "adaboost"    "mxnetAdam"  

> # [Check the parameters of adaboost]
> modelLookup('adaboost')
     model parameter  label forReg forClass probModel
1 adaboost     nIter #Trees  FALSE     TRUE      TRUE
2 adaboost    method Method  FALSE     TRUE      TRUE

> # [Set up the resampling scheme used to search for the best parameters]
> control <- caret::trainControl(method = 'repeatedcv',
+                                search = 'random',
+                                ## random search over the hyperparameters
+                                number = 3,
+                                repeats = 3,
+                                allowParallel = T,
+                                verboseIter = T
+ )

> # [๋ชจ๋ธ ํ•™์Šต ๋ฐ ๊ฒ€์ฆ]
> ada_model <- train(credit.rating ~., train,
+                      method = "adaboost",
+                      metric = 'Accuracy',
+                      preProcess = c("zv", "center", "scale", "spatialSign"),
+                      # tuneLength = 7,
+                      trControl = control)
+ Fold1.Rep1: nIter=116, method=Adaboost.M1 
- Fold1.Rep1: nIter=116, method=Adaboost.M1 
+ Fold1.Rep1: nIter=506, method=Adaboost.M1 ...

> # [์ตœ์ ๋ชจ๋ธ ๋„์ถœ]
> ada_model
AdaBoost Classification Trees 

700 samples
 20 predictor
  2 classes: 'pos', 'neg' 

Pre-processing: centered (20), scaled (20), spatial sign
 transformation (20) 
Resampling: Cross-Validated (3 fold, repeated 3 times) 
Summary of sample sizes: 466, 467, 467, 467, 467, 466, ... 
Resampling results across tuning parameters:

  nIter  method         Accuracy   Kappa    
   25    Real adaboost  0.7242785  0.2222041
  207    Adaboost.M1    0.7385724  0.3222722
  672    Adaboost.M1    0.7419085  0.3356293

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were nIter = 672 and method
 = Adaboost.M1.

> # [Check variable importance]
> plot(varImp(ada_model))

> # [ํ˜ผ๋ˆํ–‰๋ ฌ]
> confusionMatrix(test$credit.rating, predict(ada_model, test, type = 'raw'))
Confusion Matrix and Statistics

          Reference
Prediction pos neg
       pos  39  51
       neg  24 186
                                        
               Accuracy : 0.75          
                 95% CI : (0.697, 0.798)
    No Information Rate : 0.79          
    P-Value [Acc > NIR] : 0.95956       
                                        
                  Kappa : 0.349         
                                        
 Mcnemar's Test P-Value : 0.00268       
                                        
            Sensitivity : 0.6190        
            Specificity : 0.7848        
         Pos Pred Value : 0.4333        
         Neg Pred Value : 0.8857        
             Prevalence : 0.2100        
         Detection Rate : 0.1300        
   Detection Prevalence : 0.3000        
      Balanced Accuracy : 0.7019        
                                        
       'Positive' Class : pos    
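
The summary statistics in this output follow directly from the four cell counts. The sketch below recomputes them in plain Python, reading the table exactly as printed (rows labelled "Prediction", columns labelled "Reference"); the variable names are chosen here for illustration only.

```python
# Counts copied from the table above, read as printed:
# rows = "Prediction", columns = "Reference".
tp, fp = 39, 51      # row "pos": Reference pos, Reference neg
fn, tn = 24, 186     # row "neg": Reference pos, Reference neg

total = tp + fp + fn + tn
accuracy = (tp + tn) / total
sensitivity = tp / (tp + fn)         # share of Reference "pos" in row "pos"
specificity = tn / (tn + fp)         # share of Reference "neg" in row "neg"
pos_pred_value = tp / (tp + fp)
neg_pred_value = tn / (tn + fn)
balanced_accuracy = (sensitivity + specificity) / 2

for name, value in [("Accuracy", accuracy),
                    ("Sensitivity", sensitivity),
                    ("Specificity", specificity),
                    ("Pos Pred Value", pos_pred_value),
                    ("Neg Pred Value", neg_pred_value),
                    ("Balanced Accuracy", balanced_accuracy)]:
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these reproduce the figures in the output. Note that the accuracy (0.75) is below the No Information Rate (0.79), which is what the large P-Value [Acc > NIR] reflects.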

[Gradient Boosting Machine (GBM)]

  • Whereas AdaBoost improves the model by giving more weight to misclassified data, GBM performs its updates with gradient descent to reach an optimized result.

    • Gradient descent defines a loss function and repeatedly moves in the direction that reduces it, using the derivative (gradient) of the loss to find that direction. For example, the loss can be defined as the squared error between the actual value y and the prediction f(x). Differentiating it gives the gradient, and the goal is to drive the loss to its minimum.

    • The gradient obtained by partial differentiation with respect to f(x) tells f(x) which way to move to reduce the loss. For squared error, the negative gradient is, up to a constant factor, the residual y - f(x).
  • Instead of adjusting the sample weights at every iteration as AdaBoost does, GBM fits each new predictor to the residual error left by the previous predictor.

  • The weak learners are strengthened using the residuals of the previous model. In other words, each model is trained to predict residuals.
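
The link between the gradient and the residual can be written out for squared error. The notation is an assumption for illustration: y is the actual value, f(x) is the current prediction, and the factor 1/2 is a common convention that only simplifies the derivative.

```latex
L\bigl(y, f(x)\bigr) = \tfrac{1}{2}\bigl(y - f(x)\bigr)^{2}
\qquad\Longrightarrow\qquad
\frac{\partial L}{\partial f(x)} = -\bigl(y - f(x)\bigr) = f(x) - y
```

The negative gradient, -∂L/∂f(x) = y - f(x), is therefore exactly the residual, which is why fitting the next learner to the residuals amounts to one gradient descent step on the predictions.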

1. Training procedure

  • Consider a regression problem with a numeric response variable, using three models as an example.

    1) First, model A simply predicts the mean of the target variable over the whole dataset.
    Then compute: actual value - (prediction of model A (= the mean)) = residual.
    2) Next, model B is trained on the same features that were used for model A, with the residual as its target.
    Then compute: actual value - (prediction of model A (= the mean) + learning_rate * prediction of model B) = new residual.
    3) Model C is trained in the same way: it uses the same features as model A and is fit to the new residual just described.
    Then compute: actual value - (prediction of model A + learning_rate * prediction of model B + learning_rate * prediction of model C) = new residual.
    4) If there are further models, each one again learns the latest residual.
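
The A, B, C procedure above can be sketched as follows. This is a minimal illustration in plain Python, assuming one-dimensional toy data and a one-split regression stump as the weak learner; it is not the gbm implementation used in the practice section, and the names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(xs, targets):
    """Regression stump: choose the split with the smallest squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:           # keep both sides non-empty
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best

    def stump(x):
        return lm if x <= t else rm
    return stump

def gradient_boost(xs, ys, n_models=2, learning_rate=0.5):
    base = sum(ys) / len(ys)                 # model A: mean of the target
    preds = [base] * len(xs)
    stumps = []
    history = [sum((y - p) ** 2 for y, p in zip(ys, preds))]
    for _ in range(n_models):                # models B, C, ...
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)     # same features, residual target
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)))

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)
    return predict, history

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
predict, history = gradient_boost(xs, ys)
print(history)                    # squared error after A, after B, after C
print(predict(2), predict(5))
```

Each added model shrinks the remaining squared error, and the learning rate controls how much of each correction is applied.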

๊ตฌ์ฒด์ ์ธ ํ•™์Šต ํ”„๋กœ์„ธ์Šค ์ฐธ๊ณ 

2. Advantages

  • Among machine learning models, it is one of the stronger performers.

3. Disadvantages

  • Training takes a long time.
  • Hyperparameter tuning requires effort.
  • Because it is a greedy algorithm, overfitting can set in quickly.
    • A greedy algorithm makes the best available choice at each step without considering what comes later.

4. Practice

> # [Set up the resampling scheme used to search for the best parameters]
> fitControl <- trainControl(method = "repeatedcv",
+                          number = 3,
+                          repeats = 3,
+                          verboseIter = T)

> # [Check the parameters of gbm]
> modelLookup('gbm')
  model         parameter                   label forReg forClass
1   gbm           n.trees   # Boosting Iterations   TRUE     TRUE
2   gbm interaction.depth          Max Tree Depth   TRUE     TRUE
3   gbm         shrinkage               Shrinkage   TRUE     TRUE
4   gbm    n.minobsinnode Min. Terminal Node Size   TRUE     TRUE
  probModel
1      TRUE
2      TRUE
3      TRUE
4      TRUE

> # [Set the hyperparameter grid for grid search]
> tunegrid2 <- expand.grid(n.trees = c(10, 20, 30, 40),
+                          ## number of trees to build
+                          interaction.depth = c(1:10),
+                          ## = maxdepth
+                          shrinkage = c(0.1),
+                          ## = learning rate
+                          n.minobsinnode = c(10:50)
+                          ## minimum number of training observations in a terminal node
+ )

> # [๋ชจ๋ธ ํ•™์Šต ๋ฐ ๊ฒ€์ฆ]
> gbm_gridsearch2 <- train(credit.rating~.,
+                          data = train,
+                          method = 'gbm',
+                          metric = ifelse(is.factor(train$credit.rating),'Accuracy','RMSE'),
+                          tuneGrid = tunegrid2,
+                          trControl = fitControl)

> # [ํŒŒ๋ผ๋ฏธํ„ฐ์— ๋”ฐ๋ฅธ accuracy ํ๋ฆ„ ํŒŒ์•…]
> trellis.par.set(caretTheme())
> plot(gbm_gridsearch2)

> # [Check variable importance]
> summary(gbm_gridsearch2)
                                                          var    rel.inf
account.balance                               account.balance 21.1839170
credit.amount                                   credit.amount 19.3042177
credit.duration.months                 credit.duration.months 10.8977305
credit.purpose                                 credit.purpose  6.9162193
age                                                       age  6.9126727
savings                                               savings  5.8127936
previous.credit.payment.status previous.credit.payment.status  5.6003470
current.assets                                 current.assets  5.5444213
installment.rate                             installment.rate  4.1435804
employment.duration                       employment.duration  3.6837592
marital.status                                 marital.status  3.1228349
residence.duration                         residence.duration  1.9844654
apartment.type                                 apartment.type  1.9317795
other.credits                                   other.credits  1.2082644
dependents                                         dependents  1.0179670
telephone                                           telephone  0.4575234
occupation                                         occupation  0.2775065
guarantor                                           guarantor  0.0000000
bank.credits                                     bank.credits  0.0000000
foreign.worker                                 foreign.worker  0.0000000

> # [ํ˜ผ๋ˆํ–‰๋ ฌ]
> confusionMatrix(test$credit.rating, predict(gbm_gridsearch2, test))
Confusion Matrix and Statistics

          Reference
Prediction pos neg
       pos  40  50
       neg  23 187
                                         
               Accuracy : 0.7567         
                 95% CI : (0.704, 0.8041)
    No Information Rate : 0.79           
    P-Value [Acc > NIR] : 0.929670       
                                         
                  Kappa : 0.3663         
                                         
 Mcnemar's Test P-Value : 0.002342       
                                         
            Sensitivity : 0.6349         
            Specificity : 0.7890         
         Pos Pred Value : 0.4444         
         Neg Pred Value : 0.8905         
             Prevalence : 0.2100         
         Detection Rate : 0.1333         
   Detection Prevalence : 0.3000         
      Balanced Accuracy : 0.7120         
                                         
       'Positive' Class : pos      