[AI] SVM (2)

Jiyeahhh · November 25, 2021

💡 What if the training set is not linearly separable?

Soft Margin Classification

  • Hard Margin : every data point must be classified correctly!
  • Soft Margin : a few data points are allowed to be misclassified!
    ⇒ Slack variables

  • C : a hyperparameter that defines how much misclassification is tolerated
  • N : the distance from the margin, taken over all data points
  • L(w) must be minimized!
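The objective described above can be sketched in a few lines. This assumes the standard textbook soft-margin form, L(w) = ½‖w‖² + C·Σξ_i with slack ξ_i = max(0, 1 − y_i(w·x_i + b)). The post's own formula is not shown, so the exact form, the function name `soft_margin_objective`, and the toy data are all assumptions made for illustration.

```python
def soft_margin_objective(w, b, X, y, C):
    """Return (L, slacks) for L(w) = 0.5*||w||^2 + C * sum(slacks)."""
    # Margin term: a small ||w|| corresponds to a wide margin.
    margin_term = 0.5 * sum(wj * wj for wj in w)
    slacks = []
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        # Slack is 0 when the point is on the correct side of the margin
        # and grows linearly once the point crosses it.
        slacks.append(max(0.0, 1.0 - yi * score))
    return margin_term + C * sum(slacks), slacks


# Hypothetical toy data: two clean points, one inside the margin,
# and one misclassified point.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
y = [+1, -1, +1, -1]
w, b = [1.0, 0.0], 0.0

loss_small_c, slacks = soft_margin_objective(w, b, X, y, C=1.0)
loss_large_c, _ = soft_margin_objective(w, b, X, y, C=10.0)
print(slacks)        # [0.0, 0.0, 0.5, 2.0]
print(loss_small_c)  # 3.0
print(loss_large_c)  # 25.5
```

With the same w, a larger C makes the same violations far more expensive, which is why a large C pushes the solution toward hard-margin behavior.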

Non-linear SVM

  • Works very well on datasets like the one above

💡 But what should we do with a dataset like the one below?

⇒ Map it to a higher dimension ❗


  • Moving from n to (n+1) dimensions is usually enough to make the data separable!
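A small sketch of the idea, under assumed toy data: four 1-dimensional points that no single threshold can separate become separable after lifting them to 2 dimensions with the hypothetical map x → (x, x²). The helper names `threshold_separable` and `lift` are illustrative only.

```python
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [+1, -1, -1, +1]  # outer points are +1, inner points are -1


def threshold_separable(values, labels):
    """True if some threshold t and sign s classify every point correctly."""
    pts = sorted(values)
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [pts[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    candidates += [pts[-1] + 1.0]
    for t in candidates:
        for s in (+1, -1):
            if all((s * (v - t) > 0) == (lab > 0) for v, lab in zip(values, labels)):
                return True
    return False


def lift(x):
    """Map a 1-D point to 2-D by appending x^2 as a new coordinate."""
    return (x, x * x)


lifted = [lift(x) for x in xs]
separable_1d = threshold_separable(xs, ys)
# A threshold on the new coordinate is a horizontal line in the lifted 2-D space.
separable_lifted = threshold_separable([p[1] for p in lifted], ys)
print(separable_1d)      # False
print(separable_lifted)  # True
```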

📌 kernel function

  • A function that performs the mapping and also computes the inner product

Kernel function and Kernel trick

📌 Kernel function

  • A function obtained by taking the inner product in the higher-dimensional space

  • Inner product = similarity between two data points

  • Similarity in a reproducing kernel Hilbert space

  • $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$

  • Not the same concept as the kernel in linear algebra

📌 Kernel trick

  • Avoids the explicit mapping

📌 Kernel example

  • Data (vector) in 2-dimensional space: $x = [x_1 \; x_2]$

  • $K(x_i, x_j) = (1 + x_i^T x_j)^2$

  • Without actually sending the data to the higher dimension, this gives exactly the same value as mapping it there and computing the similarity
    ⇒ kernel trick

  • Introducing only a kernel function into a linear algorithm lets it solve non-linear problems too!
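The kernel example above can be checked numerically. For 2-dimensional inputs, one explicit feature map whose inner product reproduces (1 + x_iᵀx_j)² is φ(x) = [1, √2·x₁, √2·x₂, x₁², x₂², √2·x₁x₂]. The sketch below compares the two routes on an arbitrary pair of vectors; the function names and sample values are assumptions for illustration.

```python
import math


def poly2_kernel(x, y):
    """Kernel route: stays in 2-D, one inner product and one square."""
    return (1.0 + x[0] * y[0] + x[1] * y[1]) ** 2


def phi(x):
    """Explicit route: map a 2-D vector into the 6-D feature space."""
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x[0], r2 * x[1], x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1]]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


x_i = [1.0, 2.0]
x_j = [3.0, -1.0]

k_direct = poly2_kernel(x_i, x_j)        # never builds the 6-D vectors
k_explicit = dot(phi(x_i), phi(x_j))     # builds them, then takes the inner product
print(k_direct)                          # 4.0
print(math.isclose(k_direct, k_explicit))  # True
```

The kernel route costs a 2-dimensional inner product, while the explicit route needs 6-dimensional vectors. The gap widens quickly as the input dimension or polynomial degree grows.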


📌 Commonly used kernel functions

  1. Linear kernel
  2. Polynomial kernel
  3. Radial basis kernel (Gaussian)
  4. Hyperbolic Tangent kernel
  • Not every function is a kernel function ❗
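The four kernels listed above can be written out as a minimal sketch. The parameter names (`degree`, `coef0`, `gamma`) and their default values are assumptions chosen for illustration, not values given in the post.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    # Gaussian kernel: depends only on the squared distance between x and y.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def tanh_kernel(x, y, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


x = [1.0, 2.0]
y = [3.0, -1.0]
print(linear_kernel(x, y))      # 1.0
print(polynomial_kernel(x, y))  # 4.0
print(rbf_kernel(x, x))         # 1.0

# A valid kernel needs K(x, x) >= 0. With a negative offset the tanh
# function breaks this at the origin, so it is not a valid kernel for
# every parameter choice.
origin = [0.0, 0.0]
print(tanh_kernel(origin, origin, coef0=-1.0) < 0)  # True
```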

Kernel-based learning methods

  • Linear algorithms become able to solve non-linear problems too
  1. Send the data to a higher dimension
  2. Train with a linear algorithm (SVM, PCA, ridge regression, ...)
  • But these days, because of deep learning, this approach is not used that much!
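As a rough illustration of the two steps above, here is a kernelized perceptron on the XOR problem. The perceptron is swapped in for the algorithms named in the list only because it is a very short linear learner to kernelize. It is not taken from the post, and all function names are hypothetical.

```python
def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def poly2_kernel(x, y):
    return (1.0 + linear_kernel(x, y)) ** 2


def decision_score(x, X, y, alpha, kernel):
    # The data enters only through kernel evaluations, never through phi(x).
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))


def train_kernel_perceptron(X, y, kernel, max_epochs=200):
    alpha = [0] * len(X)  # alpha[i] counts the mistakes made on point i
    for _ in range(max_epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_score(xi, X, y, alpha, kernel) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha


def predict(x, X, y, alpha, kernel):
    return 1 if decision_score(x, X, y, alpha, kernel) > 0 else -1


# XOR: not linearly separable in the original 2-D space.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [-1, +1, +1, -1]

alpha_poly = train_kernel_perceptron(X, y, poly2_kernel)
preds_poly = [predict(x, X, y, alpha_poly, poly2_kernel) for x in X]
alpha_lin = train_kernel_perceptron(X, y, linear_kernel)
preds_lin = [predict(x, X, y, alpha_lin, linear_kernel) for x in X]

print(preds_poly)       # [-1, 1, 1, -1]
print(preds_lin == y)   # False
```

The training loop is identical in both runs. Only the kernel changes, and that alone decides whether XOR can be fit.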

💡 So how do we construct a kernel function?

  • Prove that it is a kernel function by showing it is symmetric positive semi-definite (Mercer's theorem)

  • Analyze the time complexity

  • Composite kernel
    E.g.) $K_{new}(x_i, x_j) = x_i^T x_j + (x_i^T x_j)^p$
    ⇒ linear kernel + polynomial kernel
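The Mercer condition can be sanity-checked numerically on a sample of points: the Gram matrix should be symmetric, and zᵀKz should be non-negative for every vector z. The sketch below does this for the composite kernel with p = 2 and contrasts it with plain Euclidean distance, which fails the test. Random probing is only a sanity check, not a proof, and the sample points and helper names are assumptions.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def composite_kernel(x, y, p=2):
    s = dot(x, y)
    return s + s ** p  # linear kernel + polynomial kernel


def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def gram_matrix(kernel, points):
    return [[kernel(a, b) for b in points] for a in points]


def is_symmetric(K, tol=1e-12):
    n = len(K)
    return all(abs(K[i][j] - K[j][i]) <= tol for i in range(n) for j in range(n))


def quadratic_form(K, z):
    n = len(K)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]

K_comp = gram_matrix(composite_kernel, points)
rng = random.Random(0)
# Smallest z^T K z over many random probes; should not be meaningfully negative.
worst = min(
    quadratic_form(K_comp, [rng.uniform(-1, 1) for _ in points])
    for _ in range(2000)
)
print(is_symmetric(K_comp), worst >= -1e-9)  # True True

# Counterexample: distance is symmetric but not positive semi-definite.
K_dist = gram_matrix(euclidean_distance, points)
q_dist = quadratic_form(K_dist, [1.0, -1.0, 0.0, 0.0])
print(is_symmetric(K_dist), q_dist < 0)      # True True
```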
