Multiple Linear Regression

Main Purpose: Building a model that accurately predicts unseen (test) data, rather than one that merely fits the train data

Train/Test Split

Train Data - used to train the model
Test Data - used to check performance of the model
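*** The two must be kept separate in order to prevent data leakage: if test data influences training, the test score overstates how well the model handles new data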
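
A minimal sketch of the split described above, assuming scikit-learn's train_test_split is used. The toy arrays, the 80/20 ratio, and random_state=42 are illustrative assumptions, not values from these notes.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the rows as test data; the model never sees them during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

Any preprocessing that learns from the data (scaling, imputing, and so on) would be fitted on X_train only and then applied to X_test, so that nothing about the test rows leaks into training.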

Simple Linear Regression vs. Multiple Linear Regression

Implementation:

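Uses the same fit/predict methods of LinearRegression() from sklearn.linear_model (fit_transform/transform belong to preprocessing transformers such as scalers, not to the regression model)
Uses the same model.intercept_ and model.coef_ attributes (note the trailing underscore)
Difference: Uses 2 or more features, so model.coef_ holds one coefficient per feature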
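
The points above can be sketched as follows. The data is synthetic and noise-free, generated from an assumed relationship y = 1 + 2*x1 + 3*x2 chosen only for illustration, so the fitted values can be compared with known ones.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data with 2 features and a known linear relationship
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

model = LinearRegression()
model.fit(X, y)  # same call as in simple linear regression

print(model.intercept_)  # a single intercept
print(model.coef_)       # one coefficient per feature

# Predict for the first 5 rows
y_pred = model.predict(X[:5])
```

The only change from the simple case is the shape of X: it has 2 or more columns instead of 1.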

Evaluation Metrics:

Overfitting vs. Underfitting

Generalization - the ability of a model to perform well on both the train and test data.
Overfitting - a model that fits the train data too closely (including its noise), so it performs well on the train data but noticeably worse on the test data
Underfitting - a model that is too simple to capture the pattern even in the train data, so it performs poorly on both the train and test data. Usually a sign of high bias.
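
One common way to tell these cases apart is to compute the same metric on the train and test data and compare the two. The sketch below assumes MSE and R2 as the metrics (the notes do not name specific ones) and uses synthetic data with a small amount of noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical data: linear signal plus small Gaussian noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# Same metrics, computed separately on train and test data
train_mse = mean_squared_error(y_train, model.predict(X_train))
test_mse = mean_squared_error(y_test, model.predict(X_test))
train_r2 = r2_score(y_train, model.predict(X_train))
test_r2 = r2_score(y_test, model.predict(X_test))

print(f"train: MSE={train_mse:.4f}, R2={train_r2:.4f}")
print(f"test:  MSE={test_mse:.4f}, R2={test_r2:.4f}")
```

As a rough reading: similar, good scores on both sets suggest the model generalizes; a good train score with a clearly worse test score suggests overfitting; poor scores on both suggest underfitting.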
