Linear Models in scikit-learn

Jin woo Kim · March 30, 2024

In this post I study the linear models in scikit-learn to prepare for the first quiz of my machine learning course. All of them are methods for regression, and the material to cover is at the link below.

Scikit-learn 1.1.1 ~ 1.1.18

Linear Models

What is a linear model?
The target value is expected to be a linear combination of the features.

$$\hat{y}(w, x) = w_0 + w_1 x_1 + ... + w_p x_p$$

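In other words, a model is called linear when the predicted value $\hat{y}$ is expressed as a linear combination of the feature values $(x_1, x_2, ..., x_p)$. The vector $\textbf w = (w_1, w_2, ..., w_p)$ holds the coefficients, stored as coef_, and $w_0$ is the intercept, stored as intercept_.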
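The formula above can be sketched in a few lines of plain Python. This is only an illustration: the helper `predict` and the numeric values for the intercept and coefficients are hypothetical, not taken from a fitted model.

```python
def predict(w0, w, x):
    """Return w0 + w_1*x_1 + ... + w_p*x_p for a single sample x."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return w0 + sum(w_i * x_i for w_i, x_i in zip(w, x))


# Hypothetical values for illustration: intercept 1.0 and two coefficients.
intercept = 1.0
coef = [0.5, 2.0]

y_hat = predict(intercept, coef, [4.0, 3.0])
print(y_hat)  # 1.0 + 0.5*4.0 + 2.0*3.0
```

In scikit-learn, `coef` and `intercept` here play the roles of the fitted attributes coef_ and intercept_.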

1.1.1. Ordinary Least Squares (OLS)

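Class: LinearRegression (in the sklearn.linear_model module)
The optimization objective is to minimize the residual sum of squares between the observed targets and the predictions: $\min_w \|Xw - y\|_2^2$.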

code example:

>>> from sklearn import linear_model
>>> reg = linear_model.LinearRegression()
>>> reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])  # Two list inputs: the feature vectors X first, then the true target values y
LinearRegression()
>>> reg.coef_
array([0.5, 0.5])
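As a rough check of what the fit produces, the following sketch refits the same toy data and computes the residual sum of squares by hand. It assumes NumPy is available alongside scikit-learn; the variable names `manual_pred` and `rss` are introduced here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same toy data as in the example above.
X = np.array([[0, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([0, 1, 2], dtype=float)

reg = LinearRegression().fit(X, y)

# A prediction is the linear combination intercept_ + X @ coef_.
manual_pred = reg.intercept_ + X @ reg.coef_

# Residual sum of squares (RSS): the quantity OLS minimizes.
rss = float(np.sum((y - reg.predict(X)) ** 2))

print(reg.coef_, reg.intercept_, rss)
```

Because the targets here lie exactly on a line, the RSS is zero up to floating-point error. Note also that the two feature columns are identical, so this toy data is itself perfectly collinear.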

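Caution: watch out for multicollinearity. When features are (nearly) linearly dependent, the design matrix becomes close to singular, so the least-squares estimate becomes very sensitive to random errors in the observed targets, and the variance of the model becomes very large.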
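One way to see the "close to singular" effect is to compare the condition number of $X^T X$ for independent features and for nearly collinear ones. The sketch below uses synthetic data; the sample size, the noise scale 1e-4, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)                    # unrelated to x1
x2_collinear = x1 + 1e-4 * rng.normal(size=n)    # almost a copy of x1

X_indep = np.column_stack([x1, x2_indep])
X_collinear = np.column_stack([x1, x2_collinear])

# A large condition number means the matrix is close to singular, so
# small perturbations in y can cause large changes in the solved weights.
cond_indep = np.linalg.cond(X_indep.T @ X_indep)
cond_collinear = np.linalg.cond(X_collinear.T @ X_collinear)

print(cond_indep, cond_collinear)
```

The collinear design should give a condition number many orders of magnitude larger than the independent one, which is the numerical symptom behind the high variance described above.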
