Unsupervised Learning and Deep Learning
- This post explains the concept and types of unsupervised learning in artificial intelligence.
- Keyword : Unsupervised learning
In traditional machine learning
- K-means clustering
- Hierarchical clustering
- Density estimation
- PCA
Characteristics
- Low dimensional data
- Simple concepts
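As one concrete instance of the traditional methods listed above, here is a minimal sketch of K-means clustering on low-dimensional (2-D) data. The function names `assign` and `kmeans`, the toy points, and the deterministic farthest-point initialization are assumptions made for illustration; they are not taken from the lecture, and practical implementations usually use randomized initialization.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def assign(points, centers):
    """Assignment step: index of the nearest center for every point."""
    labels = []
    for p in points:
        dists = [sq_dist(p, c) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels

def kmeans(points, k, iters=20):
    """Lloyd's algorithm with a deterministic farthest-point initialization
    (an illustrative simplification, chosen so the result is reproducible)."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest chosen center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Update step: move the center to the mean of its members.
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, assign(points, centers)

# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0),
          (10.0, 10.1), (9.8, 10.0), (10.1, 9.9)]
centers, labels = kmeans(points, k=2)
```

With data this simple, the two groups are recovered without any labels, which is the sense in which the traditional setting involves "simple concepts".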
In deep learning
Feature engineering vs. Representation learning
- Feature engineering
- By human
- Domain knowledge & creativity
- Brainstorming
- Representation learning
- By machine
- Deep learning knowledge & coding skill
- Trial and error
Modern unsupervised learning
- High dimensional data
- Difficult concepts ➔ Not well understood, but surprisingly good performance
- Deep learning
- Unsupervised representation learning
Representation in deep learning
- An angle θ in the range 0 ~ 2π
- The algorithm thinks 0 and 2π are different, and that 0 and 1.9π are far apart
- (x1, x2) = (cos(θ), sin(θ))
- 0 and 2π are the same
- 0 and 1.9π are close
- Goal : Represent as mathematical object
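The angle example above can be checked numerically. The following is a small sketch; the helper names `embed`, `raw_distance`, and `embedded_distance` are hypothetical and only serve to compare the two representations.

```python
import math

def embed(theta):
    """Represent an angle as a point on the unit circle: (cos θ, sin θ)."""
    return (math.cos(theta), math.sin(theta))

def raw_distance(a, b):
    """Distance when the angle is treated as a plain number."""
    return abs(a - b)

def embedded_distance(a, b):
    """Euclidean distance between the two embedded points."""
    (x1, y1), (x2, y2) = embed(a), embed(b)
    return math.hypot(x1 - x2, y1 - y2)

# As raw numbers, 0 and 2π look maximally different ...
raw_far = raw_distance(0.0, 2 * math.pi)
# ... but after embedding they coincide (up to floating-point error).
emb_same = embedded_distance(0.0, 2 * math.pi)
# 0 and 1.9π are also close after embedding: the chord length is 2·sin(0.05π).
emb_close = embedded_distance(0.0, 1.9 * math.pi)
```

The data are unchanged; only the representation differs, and with it the notion of "close" and "far" that a downstream algorithm sees.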
Human representation problems
- Humans can understand it
- Humans can design it with a goal
➔ Good representation in deep learning? : Useful and irrelevant
A well defined task
- Typically, only one attribute of interest is considered as y
- ImageNet - class
- y is well defined because it is simply defined as human selected label
- Good representation - a vague concept (Supervised)
- Even when y is well defined, what do we want for h1 and h2?
- Simply say "representation learning successful" if good performance?
- But then there is almost nothing we can say about h1 and h2
- Other than saying "useful information has been well curated"
- Is there anything we can say or pursue?
- For a general purpose, what is a good representation?
- For a well defined supervised task, what should h1 and h2 satisfy?
- Good representation - a vague concept (Unsupervised)
- For a general purpose, what is a good representation?
- General purpose often defined as a list of downstream tasks?
- So, we go back to good performance for the tasks of interest?
Representation
- What we want: a formal definition and evaluation metrics for representation
- Reality : No definition, task dependent evaluation methods
Unsupervised representation learning
- Unsupervised performance ≈ supervised performance
- For linear evaluation
- Thanks to instance discrimination, contrastive loss, and aggressive augmentation
- As in supervised learning
- Performance metric can be unclear
- Design of surrogate loss is an art (some principled; some heuristics-based)
- Training technique development is continuing (but augmentation methods are dominating)
- NLP
- Masked language modeling
- What next?
- Unsupervised representation learning
- Still a long way to go...
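To make the contrastive loss mentioned above more concrete, here is a minimal sketch of an InfoNCE-style loss, one common form of contrastive objective. This is an assumption about which loss is meant: the source only names "contrastive loss" and "instance discrimination". The function name `info_nce`, the temperature value, and the toy vectors are hypothetical. Each anchor is pulled toward the embedding of its own augmented view and pushed away from the views of the other instances in the batch.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    positives[i] is the embedding of an augmented view of the same instance
    as anchors[i]; all other positives[j] serve as negatives for anchor i.
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i in range(len(a)):
        logits = [dot(a[i], pj) / temperature for pj in p]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]  # cross-entropy with target index i
    return total / len(a)

# Toy embeddings for two instances (illustrative values only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched_views = [[0.9, 0.1], [0.1, 0.9]]     # each view resembles its own anchor
mismatched_views = [[0.1, 0.9], [0.9, 0.1]]  # views swapped between instances

loss_matched = info_nce(anchors, matched_views)
loss_mismatched = info_nce(anchors, mismatched_views)
```

The loss is small when each instance can be told apart from the others through its own view and large otherwise. No label y appears anywhere; the instance identity itself plays the role of the label.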
References
- This post is based on what I learned while participating in the LG Aimers program. (It does not cover the full course content.)
[1] LG Aimers AI Essential Course, Module 3: Unsupervised Learning (비지도학습), Prof. 이원종, Seoul National University