1. Introduction
Proposes a general, model-agnostic meta-learning algorithm.
- A task Ti is drawn from task distribution p(T).
- For Ti, the model is trained on only K samples with feedback from the loss LTi.
- It is then tested on new samples from Ti, and this test error is used when improving the model f with respect to its parameters. (Generally expressed as 'model f', since the meta-learnable object differs across methods, from algorithms to parameters.) Thus the per-task test error serves as the training error of the meta-learning process.
- Finally, new tasks are sampled from p(T), and meta-performance is measured after learning from K samples. (Tasks used for meta-testing are held out during meta-training.)
Intuition
- Some representations are more transferable.
-> Let neural networks learn features from such representations, so they are broadly applicable to all tasks in p(T).
Problems
- Do such representations exist? (Personal question)
- How can we tell which representation is the transferable one?
Approach
- It is a relative matter: we are not looking for an absolutely transferable representation, but a relatively more transferable one.
- Find the model that makes the most rapid adjustment to new tasks from p(T), i.e., one where a small change in parameters evokes a large improvement. Such a model is more transferable.
Setup
- Ti ∼ p(T) : a batch of tasks drawn from the task distribution
- Ti = {τ1, ⋯, τi} : the sampled batch of i tasks
- τm = {(xmj, ymj)}_{j=1}^{J} : the data of a single task
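The setup above can be sketched with a toy task distribution. Here p(T) is sine regression with random amplitude and phase (as in the paper's regression experiments); the exact ranges and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(J=10):
    """Sample one task tau_m from a toy p(T): sine regression with a
    random amplitude and phase (illustrative ranges)."""
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, np.pi)
    x = rng.uniform(-5.0, 5.0, size=(J, 1))
    y = amplitude * np.sin(x + phase)
    return x, y  # tau_m = {(x_mj, y_mj)}_{j=1..J}

def sample_task_batch(M=4, J=10):
    """A batch Ti = {tau_1, ..., tau_M} of M tasks drawn from p(T)."""
    return [sample_task(J) for _ in range(M)]

batch = sample_task_batch(M=4, J=10)
print(len(batch), batch[0][0].shape)  # 4 tasks, each with J=10 points
```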
Algorithm
Inner loop
- For each task in Ti = {τ1, ⋯, τi}, sample K data points Ds (the support set) each and take one gradient-descent step:
- θi′ = θ − α∇θLTi(fθ)
- This yields i sets of adapted inner-loop parameters.
- Also sample Dq (the query set) for the meta-update (outer loop).
- For each task τm, we now have (θm′, Dmq), where m ∈ {1, ⋯, i}.
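The inner-loop step above can be sketched with a toy linear model and MSE loss (the model, data, and α value are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.01  # inner-loop step size (hypothetical value)

def loss_and_grad(theta, x, y):
    """MSE loss L(f_theta) and its gradient for a linear model f_theta(x) = x @ theta."""
    err = x @ theta - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * x.T @ err / len(x)
    return loss, grad

def inner_update(theta, support_x, support_y):
    """theta_i' = theta - alpha * grad L_Ti(f_theta), computed on the K support points D^s."""
    _, grad = loss_and_grad(theta, support_x, support_y)
    return theta - alpha * grad

theta = np.zeros((2, 1))
# one task's support set D^s with K=5 points (toy data)
xs = rng.normal(size=(5, 2))
ys = xs @ np.array([[1.0], [-2.0]])
theta_prime = inner_update(theta, xs, ys)
loss_before, _ = loss_and_grad(theta, xs, ys)
loss_after, _ = loss_and_grad(theta_prime, xs, ys)
print(loss_after < loss_before)  # one adaptation step reduces the support loss
```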
Outer loop
- For each task τm, compute the loss Lm(fθm′) on Dmq.
- Take a gradient-descent step on the sum of all these losses:
- θ ← θ − β∇θ Σ_{Ti∼p(T)} LTi(fθi′)
- Note that, since θi′ = θ − α∇θLTi(fθ), the update of θ involves second-order differentiation (a gradient through a gradient).
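The full outer loop, including the second-order term, can be sketched for the same toy linear/MSE setting. For a linear model the Hessian is constant, so the Jacobian dθ′/dθ = (I − αH) has a closed form; in practice this term comes from automatic differentiation, and all sizes and step values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta = 0.05, 0.02  # inner/outer step sizes (hypothetical values)

def grad_mse(theta, x, y):
    """Gradient of the MSE loss for a linear model f_theta(x) = x @ theta."""
    return 2.0 * x.T @ (x @ theta - y) / len(x)

def hessian_mse(x):
    """For a linear model with MSE the Hessian is constant: 2 X^T X / n."""
    return 2.0 * x.T @ x / len(x)

def meta_gradient(theta, support, query):
    """Exact second-order MAML gradient for one task:
    with theta' = theta - alpha * grad L_s(theta),
    d L_q(f_theta') / d theta = (I - alpha * H_s) @ grad L_q(theta')."""
    (xs, ys), (xq, yq) = support, query
    theta_p = theta - alpha * grad_mse(theta, xs, ys)
    jac = np.eye(len(theta)) - alpha * hessian_mse(xs)  # d theta' / d theta
    return jac @ grad_mse(theta_p, xq, yq)

def outer_step(theta, tasks):
    """theta <- theta - beta * sum_i grad L_Ti(f_{theta_i'})."""
    return theta - beta * sum(meta_gradient(theta, s, q) for s, q in tasks)

# Toy task family (hypothetical): linear tasks with weights near a shared mean.
def make_task(w, K=10):
    xs, xq = rng.normal(size=(K, 2)), rng.normal(size=(K, 2))
    return (xs, xs @ w), (xq, xq @ w)

w0 = np.array([[1.0], [-1.0]])
tasks = [make_task(w0 + 0.1 * rng.normal(size=(2, 1))) for _ in range(8)]

def avg_post_adaptation_loss(theta):
    """Mean query loss after one inner-loop step, i.e. the meta-objective."""
    losses = []
    for (xs, ys), (xq, yq) in tasks:
        tp = theta - alpha * grad_mse(theta, xs, ys)
        losses.append(float(np.mean((xq @ tp - yq) ** 2)))
    return sum(losses) / len(losses)

theta = np.zeros((2, 1))
loss_before = avg_post_adaptation_loss(theta)
for _ in range(200):
    theta = outer_step(theta, tasks)
loss_after = avg_post_adaptation_loss(theta)
print(loss_after < loss_before)  # meta-training reduces the meta-objective
```

The closed-form (I − αH) Jacobian is specific to this linear toy; frameworks like PyTorch obtain it by backpropagating through the inner update.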
Consequently, the meta-test procedure is as follows:
- Sample a batch of held-out tasks Ti∗ ∼ p(T) \ Ti, disjoint from the meta-training tasks.
- Run the full algorithm (inner-loop adaptation, then evaluation) on them.
- Note that learning itself is part of what the model is evaluated on: the meta-trained model is tested on how well it adapts from K samples.
3. Species of MAML
3-1. Supervised Regression and Classification
MSE for regression
CE-loss for classification
MAML for Few-Shot Supervised Learning
- Note that, K-shot classification tasks use K input/output pairs from each class, thus NK data points for N-way classification.
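The N-way K-shot counting above can be sketched by assembling a support set from a toy dataset (the dataset shapes and helper names here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_support(dataset_by_class, classes, K):
    """Pick the N given classes and K input/output pairs per class -> N*K pairs total."""
    xs, ys = [], []
    for label, cls in enumerate(classes):
        idx = rng.choice(len(dataset_by_class[cls]), size=K, replace=False)
        for i in idx:
            xs.append(dataset_by_class[cls][i])
            ys.append(label)  # labels are re-indexed 0..N-1 per task
    return np.array(xs), np.array(ys)

# toy dataset: 5 classes, 20 examples of 8 features each (hypothetical)
data = {c: rng.normal(size=(20, 8)) for c in range(5)}
xs, ys = sample_support(data, classes=[0, 2, 4], K=5)  # 3-way 5-shot
print(xs.shape)  # (15, 8): N*K = 3*5 data points
```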
3-2. Reinforcement Learning
Definitions
- T = { L(x1,a1,...,xH,aH),q(x1),q(xt+1∣xt,at),H }
T : a task (each learning problem)
- L(xt,at) : Loss function with observation xt, output at
- q(xt) : A distribution of initial observations
- q(xt+1∣xt,at) : A transition distribution
- H : the episode length. At each time step t, the model chooses an output at, generating a trajectory of length H that serves as the query samples.
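The RL task tuple above can be sketched as a small data structure plus a rollout that generates one length-H trajectory; the field names and the toy 1-D task are illustrative, not from the paper:

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

rng = np.random.default_rng(0)

@dataclass
class RLTask:
    """T = (L, q(x1), q(x_{t+1}|x_t, a_t), H); field names are illustrative."""
    loss: Callable        # L over the trajectory (e.g. negative cumulative reward)
    init_dist: Callable   # q(x1): sample an initial observation
    transition: Callable  # q(x_{t+1} | x_t, a_t): sample the next observation
    horizon: int          # H: episode length

def rollout(task, policy):
    """Generate one length-H trajectory by repeatedly choosing an output a_t."""
    x = task.init_dist()
    traj = []
    for _ in range(task.horizon):
        a = policy(x)
        traj.append((x, a))
        x = task.transition(x, a)
    return traj, task.loss(traj)

# toy 1-D task (hypothetical): noisy drift, loss = sum of squared states
task = RLTask(
    loss=lambda traj: sum(x ** 2 for x, _ in traj),
    init_dist=lambda: float(rng.normal()),
    transition=lambda x, a: x + a + 0.01 * float(rng.normal()),
    horizon=5,
)
traj, L = rollout(task, policy=lambda x: -0.5 * x)
print(len(traj))  # H = 5
```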
Algorithm
Terminology
- task : (classification, for example) each piece of work performs classification over a specific subset of classes, which may not cover the whole class range. Thus, there can be more than one task under one dataset.