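$\hat{y}$ : predicted value
$x$ : feature (a column of the data)
$w$ : weight, also called the regression coefficient. It measures how strongly a feature influences $\hat{y}$
$b$ : intercept
$x_p$, $w_p$ : p-th feature / p-th weight
$i$ : i-th observation (sample)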
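Written out in the usual notation (the subscripts are an assumed convention), the linear regression prediction for the i-th sample combines the predicted value, features, weights and intercept as:

$$
\hat{y}_i = w_1 x_{i1} + w_2 x_{i2} + \cdots + w_p x_{ip} + b
$$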
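Sign of weight w: positive means the prediction increases as the feature increases (positive relationship) / negative means the prediction decreases as the feature increases (negative relationship)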
.coef_ : weights (one per feature)
.intercept_ : bias (intercept)
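A minimal sketch of how the two attributes behave, using noise-free synthetic data whose true weights (3 and -2) and intercept (5) are assumptions chosen for illustration. The sign of each learned weight follows the direction of the relationship.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (assumed for illustration): y = 3*x1 - 2*x2 + 5
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 5

model = LinearRegression().fit(X, y)

print(model.coef_)       # one weight per feature, close to [3, -2]
print(model.intercept_)  # close to 5

# The prediction is the weighted sum of the features plus the intercept
manual = X[0] @ model.coef_ + model.intercept_
print(np.isclose(manual, model.predict(X[:1])[0]))
```

The first weight is positive, so the prediction rises with x1. The second is negative, so the prediction falls as x2 grows.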
# Data
from dataset import get_boston_dataset
X_train, X_test, y_train, y_test = get_boston_dataset()
# Preprocessing
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
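The split between fit_transform on the training set and transform on the test set can be illustrated with a small sketch. The toy arrays and the `_demo` names are assumptions, not part of the dataset above. The point is that the test data is scaled with the mean and standard deviation learned from the training data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy arrays standing in for X_train / X_test (assumed values)
X_train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_demo = np.array([[2.0, 40.0]])

scaler_demo = StandardScaler()
train_scaled = scaler_demo.fit_transform(X_train_demo)  # learns mean/std from the training data
test_scaled = scaler_demo.transform(X_test_demo)        # reuses the training mean/std

print(train_scaled.mean(axis=0))  # approximately 0 for each column
print(train_scaled.std(axis=0))   # approximately 1 for each column
print(test_scaled)                # not necessarily zero-mean: scaled with training statistics
```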
# Create and train the model
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)
# Weights / intercept
# Weights learned during training, one multiplied with each feature.
lr.coef_
# bias (intercept)
lr.intercept_
# Inference
# predict expects a 2D array, so the single sample is reshaped to (1, n_features)
pred_1 = lr.predict(X_train_scaled[0].reshape(1, -1))
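# Check: reproduce the predictions manually as X @ w + b
import numpy as np
X_train_scaled[0] @ lr.coef_ + lr.intercept_  # single sample, should match pred_1
pred_train = X_train_scaled @ lr.coef_ + lr.intercept_  # shape (n_samples,)
pred_train.shape
pred_train2 = lr.predict(X_train_scaled)
pred_train2.shape
# Compare with a tolerance; a (n, 1) array compared to a (n,) array with == would broadcast to (n, n)
np.allclose(pred_train, pred_train2)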
# Evaluation
from metrics import print_metrics_regression
print_metrics_regression(y_train, pred_train, "train set")
pred_test = lr.predict(X_test_scaled)
print_metrics_regression(y_test, pred_test, "test set")
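print_metrics_regression is imported from a local metrics module whose implementation is not shown here. As a rough sketch of what such a helper might compute, the hypothetical function below (its name, signature and choice of metrics are assumptions) reports MSE, RMSE and R^2 using scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def print_metrics_regression_sketch(y, pred, title=None):
    """Hypothetical stand-in: prints MSE, RMSE and R^2, and returns them."""
    y = np.ravel(y)
    pred = np.ravel(pred)  # accept predictions shaped (n,) or (n, 1)
    mse = mean_squared_error(y, pred)
    rmse = np.sqrt(mse)    # RMSE is the square root of MSE
    r2 = r2_score(y, pred)
    if title:
        print(title)
    print(f"MSE: {mse:.4f}, RMSE: {rmse:.4f}, R2: {r2:.4f}")
    return mse, rmse, r2

# Toy values (assumed) to show the call
mse, rmse, r2 = print_metrics_regression_sketch(
    [3.0, 5.0, 7.0], [2.0, 5.0, 8.0], "toy example"
)
```

Comparing the train set and test set numbers side by side is what shows whether the model generalizes: a much lower error on the train set than on the test set suggests overfitting.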