[REVIEW] Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

SHIN · May 29, 2023

1. Introduction

 The paper proposes a general, model-agnostic meta-learning algorithm.

2. Model-Agnostic Meta-Learning(MAML)

 2.1. Meta learning scenario

  Meta-training part

  1. A task $T_i$ is drawn from the task distribution $p(T)$.
  2. Within $T_i$, the model is trained with only $K$ samples and feedback from $L_i$.
  3. The model is then tested on new samples from $T_i$, and this test error is used to improve the model $f$ with respect to its parameters. (It is generically expressed as "model $f$" since the meta-learnable object differs across algorithms, from update rules to parameters.) Thus this test error serves as the training error of the meta-learning process.

  Meta-testing part

  1. New tasks are sampled from $p(T)$, and meta-performance is measured after learning from $K$ samples. (Tasks used for meta-testing are held out during meta-training.)
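The scenario above can be sketched end-to-end on a toy task family. Everything here is illustrative, not the paper's setup: tasks are scalar regressions $y = wx$ with the slope $w$ playing the role of a task drawn from $p(T)$, the model is $f_\theta(x) = \theta x$, and adaptation is a single gradient step.

```python
import random

# Toy stand-in for p(T): each task is "regress y = w * x" for a task-specific
# slope w; the model is f_theta(x) = theta * x with a single scalar parameter.
def sample_task(rng):
    w = rng.uniform(-1.0, 1.0)  # step 1: draw a task T_i ~ p(T)
    return lambda k: [(x, w * x) for x in (rng.uniform(-1, 1) for _ in range(k))]

def mse(theta, samples):
    return sum((theta * x - y) ** 2 for x, y in samples) / len(samples)

def adapt(theta, samples, alpha=0.5):
    # step 2: train on only K samples, with feedback from the task loss
    grad = sum(2 * (theta * x - y) * x for x, y in samples) / len(samples)
    return theta - alpha * grad

rng = random.Random(0)
theta = 0.0                         # meta-parameters
draw = sample_task(rng)
support, query = draw(5), draw(10)  # K = 5 samples for adaptation
theta_i = adapt(theta, support)
# step 3: the error on new samples from T_i is what the meta-update minimises
test_error = mse(theta_i, query)
```

After one adaptation step the query error drops below that of the unadapted parameters; in the full algorithm this test error drives the meta-update of $\theta$.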

2.2. A Model-Agnostic Meta-Learning Algorithm

 Intuition

  • Some internal representations are more transferable than others.
    -> Let the neural network learn features of such representations, making them broadly applicable to all tasks.

 Problems

  1. Do such representations even exist? (Personal question)
  2. How can we tell which representation is the transferable one?

 Approach

  1. It is a relative matter: we are not looking for an absolutely transferable representation, but a relatively more transferable one.
  2. By finding the model that makes the most rapid adjustment to new tasks from $p(T)$: a model able to evoke a large improvement from a small change is more transferable.

Setup

  • $\mathcal{T}_i \sim p(\mathcal{T})$ : tasks are drawn from the distribution of tasks $p(\mathcal{T})$
  • $\mathcal{T}_i$ : a batch of tasks, $\mathcal{T}_i = \{\tau_1,\cdots,\tau_i\}$
  • $\tau_m$ : a single task, $\tau_m = \{x_m^j,y_m^j\}_{j=1}^{J}$

 Algorithm

Inner loop

  • For all tasks in $\mathcal{T}_i=\{\tau_1,\cdots,\tau_i\}$, sample $K$ data points $\mathcal{D}^s$ each and take a gradient-descent step:
    • $\theta_i' = \theta - \alpha\nabla_\theta\mathcal{L}_{\mathcal{T}_i}(f_\theta)$
  • This yields $i$ inner-loop parameter vectors.
  • Sample $\mathcal{D}^q$ for the meta-update (outer loop).
  • For each task $\tau_m$, we have $(\theta_m', \mathcal{D}_m^q)$, where $m \in \{1,\cdots,i\}$.
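The inner loop can be sketched as follows, reusing the illustrative scalar model $f_\theta(x) = \theta x$ (an assumption for readability; the paper's $f$ is a deep network). Each task $\tau_m$ is identified with its slope $w_m$:

```python
import random

def inner_loop(theta, slopes, K, alpha, rng):
    """For each task tau_m (a slope w_m here), sample a K-point support set
    D^s, take one gradient step, and sample a query set D^q for the outer loop."""
    results = []
    for w in slopes:
        support = [(x, w * x) for x in (rng.uniform(-1, 1) for _ in range(K))]
        grad = sum(2 * (theta * x - y) * x for x, y in support) / K
        theta_m = theta - alpha * grad    # theta'_m = theta - alpha * grad L
        query = [(x, w * x) for x in (rng.uniform(-1, 1) for _ in range(K))]
        results.append((theta_m, query))  # one (theta'_m, D^q_m) pair per task
    return results

rng = random.Random(1)
pairs = inner_loop(theta=0.0, slopes=[0.5, -0.3, 0.8], K=5, alpha=0.1, rng=rng)
# i tasks in -> i adapted parameters (plus query sets) out
```

Note that each adapted $\theta_m'$ moves from the shared $\theta$ toward its own task, while $\theta$ itself is only touched in the outer loop.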

Outer loop

  • For each task $\tau_m$, compute $\mathcal{L}_m(f_{\theta_m'},\mathcal{D}_m^q)$.
  • Take a gradient-descent step on the sum of all task losses:
    • $\theta \leftarrow \theta-\beta\nabla_\theta\displaystyle\sum_{\mathcal{T}_i\sim p(\mathcal{T})}\mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$
    • Note that, since $\theta_i' = \theta - \alpha\nabla_\theta\mathcal{L}_{\mathcal{T}_i}(f_\theta)$, the update of $\theta$ involves second-order derivatives.
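For a scalar model $f_\theta(x)=\theta x$ with squared error (an illustrative assumption, chosen because the loss is then quadratic and everything has a closed form), the second-order term can be written out: with support statistics $A = \mathrm{mean}(x^2)$ and $B = \mathrm{mean}(xy)$, the inner step is $\theta' = \theta - \alpha(2A\theta - 2B)$, so $d\theta'/d\theta = 1 - 2\alpha A$, and the meta-gradient is the query-set gradient multiplied by this Jacobian. Dropping the Jacobian gives the first-order approximation:

```python
def stats(samples):
    A = sum(x * x for x, _ in samples) / len(samples)  # mean(x^2)
    B = sum(x * y for x, y in samples) / len(samples)  # mean(x*y)
    return A, B

def meta_gradient(theta, support, query, alpha, first_order=False):
    A, B = stats(support)
    theta_p = theta - alpha * (2 * A * theta - 2 * B)  # inner step theta'
    Aq, Bq = stats(query)
    grad_q = 2 * Aq * theta_p - 2 * Bq                 # dL_q / d theta'
    if first_order:
        return grad_q                    # first-order variant: Jacobian dropped
    return grad_q * (1 - 2 * alpha * A)  # chain rule through d theta'/d theta

# A task y = 0.5 * x with hand-picked support and query points
support = [(1.0, 0.5), (0.5, 0.25)]
query = [(0.8, 0.4), (-0.6, -0.3)]
g2 = meta_gradient(0.0, support, query, alpha=0.1)                    # full MAML
g1 = meta_gradient(0.0, support, query, alpha=0.1, first_order=True)  # 1st-order
```

The two gradients differ exactly by the factor $1 - 2\alpha A$, which is the contribution of differentiating through the inner update.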

    Consequently, the meta-objective is: $\min_\theta \displaystyle\sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'}) = \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta - \alpha\nabla_\theta\mathcal{L}_{\mathcal{T}_i}(f_\theta)})$

Meta test step (Meta performance test)

  • Sample a batch of tasks $\mathcal{T}_i^* \sim p(\mathcal{T}) \setminus \mathcal{T}_i$, i.e. held out from the meta-training tasks.
  • Run the full algorithm on them, adaptation step included.
    • The purpose of the model includes training itself, so evaluation must include learning from the $K$ samples.

3. Species of MAML

 3.1. Supervised Regression and Classification

  MSE for regression

  CE-loss for classification  

  MAML for Few-Shot Supervised Learning

  • Note that K-shot classification tasks use $K$ input/output pairs from each class, hence $NK$ data points in total for N-way classification.
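A tiny episode builder illustrating the $NK$ count (the name `build_episode` and the toy dataset are hypothetical, not from the paper's code):

```python
# Hypothetical N-way K-shot episode builder (names are illustrative).
def build_episode(dataset, classes, K):
    """Take K labelled examples from each of the N sampled classes,
    giving N * K support data points in total."""
    return [(x, c) for c in classes for x in dataset[c][:K]]

# A toy dataset: 5 classes with 3 examples each
dataset = {c: [f"img_{c}_{j}" for j in range(3)] for c in range(5)}
support_set = build_episode(dataset, classes=[0, 1, 2, 3, 4], K=1)
# 5-way 1-shot -> 5 * 1 = 5 support points
```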

 3.2. Reinforcement Learning

  Definitions

  1. $\mathcal{T} = \{\,L(x_1, a_1, \ldots, x_H, a_H),\ q(x_1),\ q(x_{t+1}\mid x_t, a_t),\ H\,\}$
    $\mathcal{T}$ : a task (each learning problem)
  2. $L(x_t, a_t)$ : loss function over observations $x_t$ and outputs $a_t$
  3. $q(x_1)$ : distribution over initial observations
  4. $q(x_{t+1}\mid x_t, a_t)$ : transition distribution
  5. $H$ : episode length; at each time $t$ the model chooses an output $a_t$, so each task generates samples of length $H$
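The task tuple can be written down as a small Python structure (field names and the 1-D state are illustrative assumptions, not from the paper's code):

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Sketch of the task tuple T = {L, q(x1), q(x_{t+1}|x_t, a_t), H}.
@dataclass
class RLTask:
    loss: Callable[[List[float], List[float]], float]  # L(x_1,a_1,...,x_H,a_H)
    init_dist: Callable[[], float]                     # q(x_1)
    transition: Callable[[float, float], float]        # q(x_{t+1} | x_t, a_t)
    horizon: int                                       # episode length H

def rollout(task: RLTask, policy) -> Tuple[List[float], List[float]]:
    """Generate one trajectory of length H by choosing a_t = policy(x_t)."""
    x = task.init_dist()
    xs, acts = [], []
    for _ in range(task.horizon):
        xs.append(x)
        a = policy(x)
        acts.append(a)
        x = task.transition(x, a)
    return xs, acts

task = RLTask(
    loss=lambda xs, acts: sum(x * x for x in xs),  # e.g. drive the state to 0
    init_dist=lambda: 1.0,                         # deterministic start, for brevity
    transition=lambda x, a: x + a,                 # deterministic dynamics, for brevity
    horizon=3,
)
xs, acts = rollout(task, policy=lambda x: -0.5 * x)
```

In the actual RL setting both $q(x_1)$ and the transition are stochastic distributions; they are deterministic callables here only to keep the sketch short.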

 Algorithm

Terminology

  • task : (classification, for example) each task is given a specific set of classes to classify, and this set may not span the whole class range; thus there can be more than one task under a single dataset.