[ML] 2. Logistic Regression for Classification

실버버드 · October 19, 2024

Machine Learning


Lecture Week 3. Logistic Regression for Classification

Logistic Regression
: a classification algorithm, used when the value of the target variable is categorical. Here it is used for binary output: each example belongs to one of two classes, either 0 or 1
y \in \{0,1\}

Mathematical Representation
0 \leq h_\theta(x) \leq 1\\ \displaystyle h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}
\displaystyle g(z) = \frac{1}{1 + e^{-z}}: logistic function (sigmoid)
\displaystyle P(y|x;\theta) = (h_\theta(x))^y(1-h_\theta(x))^{1-y} for y \in \{0, 1\}
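The sigmoid, hypothesis, and Bernoulli likelihood above can be sketched in NumPy (the function names are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): the predicted probability that y = 1."""
    return sigmoid(theta @ x)

def bernoulli_likelihood(theta, x, y):
    """P(y | x; theta) = h^y * (1 - h)^(1 - y) for y in {0, 1}."""
    h = hypothesis(theta, x)
    return h**y * (1.0 - h)**(1.0 - y)
```

Because y is 0 or 1, exactly one of the two factors is active, which is what makes the single-expression form of P(y|x;θ) work.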

Max Likelihood
each example x^i is independent, so the likelihood of the whole dataset = the product of the likelihoods of the individual examples
\displaystyle L(\theta) = P(\overrightarrow{y}|X;\theta) = \prod^m_{i=1}P(y^i|x^i;\theta) : likelihood
\displaystyle l(\theta) = \log L(\theta) = \sum^m_{i=1}y^i \log h(x^i) + (1 - y^i)\log(1 - h(x^i)) : maximize the log likelihood, i.e. find \theta where l'(\theta)=0
\displaystyle \theta_j := \theta_j + \alpha\frac{\partial l(\theta)}{\partial\theta_j} = \theta_j + \alpha(y^i - h_\theta(x^i))x_j^i : gradient ascent (update for a single example i)
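The per-example update rule above can be sketched as stochastic gradient ascent in NumPy (a minimal sketch; `fit`, the learning rate, and epoch count are illustrative choices, not from the lecture):

```python
import numpy as np

def sgd_ascent_step(theta, x_i, y_i, alpha):
    """One stochastic gradient-ascent update on example (x_i, y_i):
    theta_j := theta_j + alpha * (y_i - h_theta(x_i)) * x_ij."""
    h = 1.0 / (1.0 + np.exp(-(theta @ x_i)))  # h_theta(x_i)
    return theta + alpha * (y_i - h) * x_i

def fit(X, y, alpha=0.1, epochs=100):
    """Maximize the log-likelihood l(theta) by sweeping the update over all examples."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            theta = sgd_ascent_step(theta, x_i, y_i, alpha)
    return theta
```

Note the update has the same algebraic form as the LMS rule for linear regression, but h_θ here is the sigmoid of θᵀx rather than θᵀx itself.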

Min Cost Function
cost(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1 - h_\theta(x)) = \begin{cases} -\log(h_\theta(x)), & \text{if } y=1\\ -\log(1 - h_\theta(x)), & \text{if } y=0 \end{cases}
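This cost is just the negative of a single example's log-likelihood term, so minimizing its sum over the dataset is equivalent to maximizing l(θ). A minimal sketch:

```python
import numpy as np

def cost(h, y):
    """Per-example cross-entropy cost:
    cost(h, y) = -y*log(h) - (1 - y)*log(1 - h), for y in {0, 1}."""
    return -y * np.log(h) - (1.0 - y) * np.log(1.0 - h)
```

The cost approaches 0 when the prediction is confident and correct, and grows without bound when it is confident and wrong.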

Newton's Method
: a method to find \theta such that f(\theta)=0

\displaystyle \theta := \theta - \frac{f(\theta)}{f'(\theta)}

to maximize l(\theta), Newton's method takes bigger steps and typically converges in fewer iterations than gradient ascent (at the cost of computing second derivatives at each step)
let f(\theta) = l'(\theta)\\ \displaystyle \theta^{(t+1)} = \theta^{(t)} - \frac{l'(\theta^{(t)})}{l''(\theta^{(t)})}
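For a vector θ, l'' becomes the Hessian matrix H, and the update generalizes to θ := θ − H⁻¹∇l. A minimal NumPy sketch under that assumption (the function name and iteration count are illustrative; it may misbehave on perfectly separable data, where the MLE is not finite):

```python
import numpy as np

def newton_fit(X, y, iters=10):
    """Newton's method for maximizing the log-likelihood l(theta):
    theta := theta - H^{-1} grad l, the vector form of theta - l'/l''."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # h_theta(x^i) for all i
        grad = X.T @ (y - h)                    # gradient of l(theta)
        S = h * (1.0 - h)                       # Bernoulli variances h(1-h)
        H = -(X * S[:, None]).T @ X             # Hessian of l(theta)
        theta = theta - np.linalg.solve(H, grad)
    return theta
```

Each iteration solves one linear system in the number of parameters, which is why Newton's method needs far fewer iterations but more work per iteration than gradient ascent.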
