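A type of unsupervised learning that reduces the dimensionality of a multi-dimensional dataset made up of many features, producing a dataset in a new, lower-dimensional space
Very useful for memory efficiency and data visualization
Examples include PCA and t-SNE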
[Source: https://steemit.com/steempress/@hellosketch/tr2qjs3dv1]
1) Find the principal component of the data distribution
→ the direction vector along which the data has the largest variance
2) Find an orthogonal basis (the axes of the new coordinate system) and project the data from the high-dimensional space onto the low-dimensional one
3) Each new feature is a linear combination of the original features (PCA builds new features; it does not select a subset of the existing ones)
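The three steps above can be sketched with NumPy as follows. This is only a minimal illustration, assuming PCA is computed via the eigendecomposition of the covariance matrix; the function name `pca` and the toy data are made up for this note and do not come from the source.

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA sketch (hypothetical helper, not a library API)."""
    # Center the data so variance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are features).
    cov = np.cov(X_centered, rowvar=False)
    # eigh handles symmetric matrices and returns orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Step 1: sort directions by variance, largest first.
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]   # step 2: orthogonal basis (one axis per column)
    explained = eigvals[order]       # variance along each kept axis
    # Step 3: each new feature is a linear combination of the old ones.
    return X_centered @ components, components, explained

# Toy data (an assumption for illustration): 200 points stretched along (1, 1).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.1 * rng.normal(size=200)])

Z, components, explained = pca(X, n_components=1)
print(Z.shape)            # 2-D data projected onto 1 dimension
print(components[:, 0])   # first PC axis, roughly parallel to (1, 1) up to sign
```

The sign of each axis is arbitrary, so the printed vector may point along (1, 1) or (-1, -1); both describe the same principal direction.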
BASIS : a set of linearly independent vectors that can serve as a new coordinate system
In fact, the coordinates we normally use are linear combinations of basis vectors → ex) (1,1) = (0,1) + (1,0)
PC axis : the most important basis vector, i.e. the one along which the data varies the most
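To make the idea of a basis concrete, the sketch below writes the same point in two coordinate systems. The rotated basis `b1`, `b2` is a hypothetical choice for illustration (a 45-degree rotation of the standard axes), not something taken from the source.

```python
import numpy as np

# Standard basis: the usual x and y axes.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
point = 1.0 * e1 + 1.0 * e2          # the point (1, 1)

# Hypothetical new orthonormal basis, rotated 45 degrees.
b1 = np.array([1.0, 1.0]) / np.sqrt(2)
b2 = np.array([-1.0, 1.0]) / np.sqrt(2)
B = np.column_stack([b1, b2])        # basis vectors as columns

# Coordinates of the same point in the new basis (projection onto each axis).
coords = B.T @ point
# Recombining the basis vectors with those coordinates gives the point back.
reconstructed = B @ coords

print(coords)          # all of the point lies along b1
print(reconstructed)   # same point as before
```

Here the point sits entirely on `b1`, so one coordinate is enough to describe it. PCA looks for this kind of axis in the data.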