[Paper Review] (ImgCls) Prototypical Networks for Few-shot Learning

빵 반죽 · August 24, 2023

0. Prerequisites

  • Bregman divergence (reference)
    • The difference between a function $F$'s actual value and its first-order Taylor approximation
    • $D_F(p, q) = F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle$
    • first-order approximation: $F(p) \approx F(q) + \langle \nabla F(q),\, p - q \rangle$
    • $\langle \cdot, \cdot \rangle$ is an inner product (here, the ordinary dot product on $\mathbb{R}^n$)
    • e.g., squared Euclidean norm -> squared Euclidean distance, negative entropy -> KL divergence
    • In short, think of it as a measure of the discrepancy between $p$ and $q$
    • (Important) If $p$ is a random vector, the $q$ minimizing $\mathbb{E}[D_F(p, q)]$ is the mean of the random $p$'s (quick derivation after this list)
  • K-Nearest-Neighbor (KNN)
    • Classify the query by majority vote among its $k$ nearest points (minimal sketch after this list)
  • Generative model vs. Discriminative model
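
A quick check of the mean-minimizer fact for the squared Euclidean case $F(x) = \|x\|^2$ (my own derivation, not from the paper):

$$D_F(p, q) = \|p\|^2 - \|q\|^2 - \langle 2q,\, p - q \rangle = \|p - q\|^2$$

$$\nabla_q\, \mathbb{E}\big[\|p - q\|^2\big] = 2\,(q - \mathbb{E}[p]) = 0 \;\Rightarrow\; q^{*} = \mathbb{E}[p]$$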
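
And a minimal NumPy sketch of the KNN majority-vote rule (the function and its signature are my own illustration, not a library API):

```python
import numpy as np

def knn_predict(query, points, labels, k=3):
    """Majority vote among the k points nearest to the query.

    points: (N, D) array, labels: (N,) integer array, query: (D,) array.
    """
    dists = np.linalg.norm(points - query, axis=1)  # Euclidean distance to every point
    nearest = np.argsort(dists)[:k]                 # indices of the k closest points
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]                # most common label wins
```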

1. Introduction

  • Matching Networks (Vinyals et al.)
    • the embedding is learned (via a neural network)
    • weighted nearest-neighbor classifier
  • meta-learning via LSTM (Ravi & Larochelle)
    • learns to train a custom model for each episode
  • very little data -> models are prone to overfitting
    • take advantage of this fact and assume that there exists a single embedding that represents each class (a "prototype")
    • this prototype is defined as the average of its support-set embeddings
    • the embeddings are generated by a neural network
  • the importance of choosing a good metric

2. Prototypical Networks

  • Prototype $c_k \in \mathbb{R}^M$
    • Embedding function $f_\phi : \mathbb{R}^D \to \mathbb{R}^M$ ($\phi$ is the learnable parameter)
    • $c_k$ is the average of the embedded support points in $S_k$ (for class $k$)
    • Distance function $d : \mathbb{R}^M \times \mathbb{R}^M \to [0, +\infty)$
    • Probability that $\mathbf{x}$ is class $k$: $p_\phi(y = k \mid \mathbf{x}) = \mathrm{softmax}(-d(f_\phi(\mathbf{x}), c_k))$
    • Objective: minimize $J(\phi) = -\log p_\phi(y = k \mid \mathbf{x})$ for the true class $k$ (a PyTorch sketch follows this list)
  • Connections to Mixture Density Estimation
    • If $d$ is a regular Bregman divergence (e.g., squared Euclidean distance), a Prototypical Network is equivalent to mixture density estimation on the support set with an exponential family density.
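
A minimal PyTorch sketch of the classification rule above (the function, its signature, and the `embed` network are my own placeholders, not the authors' code):

```python
import torch
import torch.nn.functional as F

def prototypical_log_probs(support_x, support_y, query_x, embed, n_way):
    """log p(y = k | x) for each query point, following the definitions above.

    support_x: (N_s, D) support examples, support_y: (N_s,) labels in [0, n_way)
    query_x:   (N_q, D) query examples,   embed:     the embedding network f_phi
    """
    z_support = embed(support_x)                    # (N_s, M)
    z_query = embed(query_x)                        # (N_q, M)
    # c_k = mean of the embedded support points of class k
    prototypes = torch.stack(
        [z_support[support_y == k].mean(dim=0) for k in range(n_way)]
    )                                               # (n_way, M)
    dists = torch.cdist(z_query, prototypes) ** 2   # squared Euclidean, (N_q, n_way)
    return F.log_softmax(-dists, dim=1)             # softmax over negative distances

# One training step then minimizes J(phi) = -log p(y = k | x) for the true class:
# loss = F.nll_loss(prototypical_log_probs(sx, sy, qx, embed, n_way), query_labels)
```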

3. Experiments

  • Training Setting
    • shot -> better to match the number of shots between training and test episodes
      • This seems to depend on the model; another paper reported better results with a mismatched shot count. What difference causes this?
    • way -> training with a larger way gives better results
      • A larger way forces the embedding to capture finer distinctions between classes; training on the harder task improves generalization (episode-sampling sketch at the end of this section)
  • Metric
    • Squared Euclidean distance suits this model much better than cosine similarity: it is a Bregman divergence, so the class mean is the optimal representative of the support points, while cosine distance is not. The reported results are far superior to those with cosine similarity.
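
For concreteness, a sketch of what one N-way K-shot training episode looks like (the `data_by_class` dict and all names are my assumptions, not the paper's code):

```python
import random

def sample_episode(data_by_class, n_way, k_shot, n_query):
    """Draw one N-way K-shot episode from {class_name: [examples]}."""
    classes = random.sample(sorted(data_by_class), n_way)  # pick N classes
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(data_by_class[cls], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]  # K shots per class
        query += [(x, label) for x in examples[k_shot:]]    # held-out query points
    return support, query
```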

4. Opinion

  • The part that interests me is that matching the train/test conditions is itself the "meta" in meta-learning
