PCA (Principal Component Analysis)

더기덕 · April 5, 2022

Concept

PCA reduces the number of variables in a data set while preserving as much information as possible.
It is a dimensionality-reduction method that transforms a large set of variables into a smaller one that still contains most of the information (variance) in the large set.
This comes at the expense of some accuracy; the idea is to trade a little accuracy for simplicity.

Process

Compute the covariance matrix

What is the covariance matrix?
- A p x p symmetric matrix (p = number of initial variables) whose entry (i, j) is the covariance between variables i and j; the diagonal holds the variance of each variable
How do you compute it?
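
One common way to answer this, sketched here with NumPy under the assumption that rows are samples and columns are variables: center each column on its mean, then divide the product of the centered data with itself by n - 1 (the sample covariance). The helper name `covariance_matrix` and the toy data are made up for illustration.

```python
import numpy as np

def covariance_matrix(X):
    """Sample covariance of X, where rows are samples and columns are variables."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)              # subtract each column's mean
    return centered.T @ centered / (X.shape[0] - 1)

# Toy data (hypothetical): 3 samples, 2 variables
X = np.array([[2.0, 0.0],
              [4.0, 1.0],
              [6.0, 5.0]])
C = covariance_matrix(X)
print(C)
```

`np.cov(X, rowvar=False)` should give the same p x p matrix, so the manual version is mainly useful for seeing what happens at each step.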

Compute the eigenvectors and eigenvalues of the covariance matrix

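Definition of eigenvectors
A nonzero vector v is an eigenvector of A with eigenvalue λ if Av = λv, i.e. (A - λE)v = 0, where E is the identity matrix.
The determinant of (A - λE) has to be zero, because otherwise (A - λE) is invertible and the only solution is v = 0.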

Each eigenvalue is the variance of the data along its eigenvector (a principal component), i.e. how much of the total variance that component explains.
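
As a rough sketch of this step, assuming NumPy: `np.linalg.eigh` is meant for symmetric matrices such as a covariance matrix and returns eigenvalues in ascending order, so they are re-sorted here to put the largest variance first. The function name `principal_components` and the example matrix are illustrative only.

```python
import numpy as np

def principal_components(C):
    """Eigenvalues (descending) and matching eigenvectors (as columns) of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(C)             # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]             # indices for largest-first ordering
    return vals[order], vecs[:, order]

# Example covariance matrix (hypothetical values)
C = np.array([[4.0, 5.0],
              [5.0, 7.0]])
eigenvalues, eigenvectors = principal_components(C)

# Share of the total variance explained by each component
explained_ratio = eigenvalues / eigenvalues.sum()
print(eigenvalues, explained_ratio)
```

Projecting centered data onto the first k columns of `eigenvectors` (for example `centered @ eigenvectors[:, :k]`) would then give the reduced representation.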

How Many Dimensions are we going to use?

1. Set a target variance

e.g. 90% → keep the smallest number of components whose eigenvalues add up to at least this share of the sum of all eigenvalues

2. Use the elbow method
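
Both criteria can be sketched in a few lines, assuming the eigenvalues are already available. The elbow is normally read off a scree plot by eye; the `elbow_components` function below is only one possible heuristic (largest second difference of the sorted eigenvalues), not a standard definition, and both function names and the eigenvalues are invented for the example.

```python
import numpy as np

def components_for_variance(eigenvalues, target=0.90):
    """Smallest k whose top-k eigenvalues reach `target` of the total variance."""
    vals = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(vals) / vals.sum()  # cumulative explained-variance ratio
    k = int(np.searchsorted(cumulative, target) + 1)
    return min(k, len(vals))                   # guard against rounding at target = 1.0

def elbow_components(eigenvalues):
    """Heuristic elbow: number of components before the sharpest bend in the scree curve.

    Needs at least 3 eigenvalues. This is an assumption-laden shortcut, not a
    replacement for looking at the plot.
    """
    vals = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    curvature = np.diff(vals, n=2)             # entry i is the bend at component i + 1
    return int(np.argmax(curvature) + 1)

# Hypothetical eigenvalues of a covariance matrix
eigenvalues = [5.0, 3.0, 1.5, 0.3, 0.2]
k_variance = components_for_variance(eigenvalues, target=0.90)
k_elbow = elbow_components(eigenvalues)
print(k_variance, k_elbow)
```

The two rules do not have to agree in general; the target-variance rule is explicit about how much information is kept, while the elbow rule depends on the shape of the eigenvalue curve.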

Why do we use eigenvectors and eigenvalues of the covariance matrix?
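
A short version of the usual argument, sketched under the assumption that the data x is mean-centered with covariance matrix Σ and that w is a unit vector (these symbols are introduced here, not taken from the post): the direction that keeps the most variance after projection turns out to be an eigenvector of Σ, and the variance it keeps is the matching eigenvalue.

```latex
\begin{aligned}
\operatorname{Var}(w^\top x) &= w^\top \Sigma\, w,
  \qquad \text{subject to } w^\top w = 1 \\
\mathcal{L}(w, \lambda) &= w^\top \Sigma\, w - \lambda\,(w^\top w - 1) \\
\frac{\partial \mathcal{L}}{\partial w} &= 2\,\Sigma\, w - 2\,\lambda\, w = 0
  \;\;\Longrightarrow\;\; \Sigma\, w = \lambda\, w \\
\operatorname{Var}(w^\top x) &= w^\top \Sigma\, w = \lambda\, w^\top w = \lambda
\end{aligned}
```

So the eigenvector with the largest eigenvalue is the direction of maximum variance, and each following eigenvector captures the most remaining variance among directions orthogonal to the earlier ones.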

ref. 공돌이의 수학 노트 - Principal Component Analysis (PCA)