Machine Learning - LinearReg-Lasso, Ridge

화이티 · December 17, 2023

Linear Regression: Regularization with Lasso and Ridge

Lasso and Ridge are two regularization techniques used in machine learning, especially in linear regression, to prevent overfitting and improve the model's generalization performance. Both Lasso and Ridge introduce a regularization term to the cost function, but they differ in the type of regularization they apply.

  1. Lasso (L1 Regularization):
    • Regularization Term: Lasso adds the absolute values of the coefficients (weights) to the cost function.
    • Small coefficients are not just shrunk but driven all the way to exactly zero.
    • Features whose coefficients become zero are effectively dropped from the model.
    • Purpose: It encourages sparsity in the model, meaning it tends to force some of the coefficient values to be exactly zero. This makes Lasso useful for feature selection.
    • Use Case: When you suspect that many features are irrelevant or redundant, and you want to automatically perform feature selection.
    • Mathematical Expression: Cost = MSE + λ Σᵢ |w_i|. Here, λ is the regularization parameter, and |w_i| represents the absolute values of the coefficients.
  2. Ridge (L2 Regularization):
    • Regularization Term: Ridge adds the squared values of the coefficients to the cost function.
    • Purpose: It penalizes large coefficients but does not force them to be exactly zero. Ridge is effective in situations where most features are likely to be relevant, and it helps to prevent multicollinearity.
    • Use Case: When you want to prevent the model from relying too much on any particular feature and you are less concerned about feature selection.
    • Mathematical Expression: Cost = MSE + λ Σᵢ w_i^2. Here, λ is the regularization parameter, and w_i^2 represents the squared values of the coefficients. (Both penalty terms are computed numerically in the short sketch below.)
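As a quick numerical illustration of the two penalty terms above (the weight vector and λ value are made up purely for this example), they can be computed directly:

import numpy as np

w = np.array([0.5, -2.0, 0.0, 3.1])   # hypothetical coefficient vector
lam = 0.1                             # hypothetical regularization strength λ

l1_penalty = lam * np.sum(np.abs(w))  # Lasso term: λ * Σ|w_i|
l2_penalty = lam * np.sum(w ** 2)     # Ridge term: λ * Σ w_i^2

print(l1_penalty)  # 0.56
print(l2_penalty)  # 1.386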

To summarize, Lasso is suitable when you want to perform feature selection by driving some coefficients to zero, while Ridge is suitable when you want to prevent the model from relying too heavily on any particular feature and still include all features in the model. The choice between Lasso and Ridge often depends on the specific characteristics of the dataset and the goals of the modeling task. There is also Elastic Net regularization, which combines both the L1 and L2 penalties.
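Since Elastic Net is mentioned above, here is a minimal sketch of how it looks in scikit-learn; the toy dataset and the alpha/l1_ratio values are arbitrary choices for illustration, not recommendations:

from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_score

# Toy regression data, used only so the example runs on its own
X, y = make_regression(n_samples=200, n_features=20, noise=10, random_state=0)

# l1_ratio controls the mix of penalties: 1.0 is pure Lasso (L1), 0.0 is pure Ridge (L2)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5)
print(cross_val_score(enet, X, y, cv=5).mean())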

Lasso: applies the same-sized penalty to every weight, so small weights are pushed to exactly zero > feature selection

Ridge: penalizes weights in proportion to their size, shrinking large weights without forcing any to zero > all features are kept

from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

# Regularization strength is controlled by alpha
# Higher alpha: stronger regularization
# Lower alpha: weaker regularization; if alpha is set too low,
# the result is practically the same as plain LinearRegression
lasso = Lasso(alpha=0.1)
ridge = Ridge(alpha=0.1)

# X_train, y_train are assumed to come from an earlier train/test split
lasso.fit(X_train, y_train)
ridge.fit(X_train, y_train)

# Compare the two models with 5-fold cross-validation (mean R^2 score)
print(cross_val_score(lasso, X_train, y_train, cv=5).mean())
print(cross_val_score(ridge, X_train, y_train, cv=5).mean())
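
To see the feature-selection effect in practice, the fitted coefficients can be inspected; a minimal sketch, assuming the lasso and ridge objects above have already been fitted:

import numpy as np

# Lasso typically sets some coefficients to exactly zero; Ridge only shrinks them
print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0))
print("Ridge zero coefficients:", np.sum(ridge.coef_ == 0))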