Learning to Compare: Relation Network for Few-Shot Learning

3 Method

3.1 Problem Definition

If the support set contains K labeled examples for each of C unique classes, the problem is called C-way K-shot learning.

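In each training iteration, K labeled samples are selected from each of the C classes chosen from the training set. The selected samples form the sample set, and the remaining samples of those classes (or a subset of them) form the query set.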
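The episode construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `dataset` layout (a dict from class label to a list of examples), the `sample_episode` helper, and the `n_query` parameter are all assumptions made for the example.

```python
import random

def sample_episode(dataset, c_way, k_shot, n_query, rng):
    """Build one C-way K-shot episode.

    dataset: dict mapping class label -> list of examples (hypothetical layout).
    Returns (sample_set, query_set) as lists of (example, label) pairs.
    """
    classes = rng.sample(sorted(dataset), c_way)  # pick C classes for this episode
    sample_set, query_set = [], []
    for label in classes:
        # Draw K + n_query distinct examples so the two sets never overlap.
        examples = rng.sample(dataset[label], k_shot + n_query)
        sample_set += [(x, label) for x in examples[:k_shot]]
        query_set += [(x, label) for x in examples[k_shot:]]
    return sample_set, query_set

# Toy data: 10 classes with 20 string "images" each.
toy = {c: [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
sample_set, query_set = sample_episode(
    toy, c_way=5, k_shot=1, n_query=3, rng=random.Random(0)
)
print(len(sample_set), len(query_set))  # 5 15
```

Drawing the sample and query examples in a single call keeps the two sets disjoint within each class.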

3.2 Model

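One-shot. The model consists of two parts: an embedding module and a relation module. Samples $$x_i$$ from the sample set S and $$x_j$$ from the query set Q are passed through the embedding module. The resulting feature maps $$f_\phi(x_i)$$ and $$f_\phi(x_j)$$ are then combined by an operator, $$C(f_\phi(x_i), f_\phi(x_j))$$. Here $$C(\cdot, \cdot)$$ denotes concatenation (not to be confused with the number of classes C).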

The relation module takes this combined feature map and produces a scalar, the relation score:

$$r_{i,j} = g_\phi(C(f_\phi(x_i), f_\phi(x_j)))$$, where $$i = 1, 2, \ldots, C$$
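To make the data flow concrete, here is a toy sketch of scoring one query against C sample images. The `embed` and `relation` functions are stand-ins chosen for brevity (a linear map with ReLU, and a linear layer with a sigmoid); they are assumptions for illustration, not the CNN modules used in the paper, and the weights are random rather than learned.

```python
import math
import random

def embed(x, weights):
    """Toy stand-in for the embedding module f: linear map plus ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def relation(pair_feature, weights):
    """Toy stand-in for the relation module g: linear layer plus sigmoid."""
    z = sum(w * v for w, v in zip(weights, pair_feature))
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
in_dim, feat_dim, c_way = 4, 3, 5
f_weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(feat_dim)]
g_weights = [rng.uniform(-1, 1) for _ in range(2 * feat_dim)]

# One sample image x_i per class (one-shot) and a single query image x_j.
sample_images = [[rng.random() for _ in range(in_dim)] for _ in range(c_way)]
query_image = [rng.random() for _ in range(in_dim)]

f_query = embed(query_image, f_weights)
scores = []
for x_i in sample_images:
    combined = embed(x_i, f_weights) + f_query     # list concatenation as C(., .)
    scores.append(relation(combined, g_weights))   # relation score r_{i,j}

# The query is assigned to the class with the highest relation score.
predicted_class = max(range(c_way), key=lambda i: scores[i])
print(len(scores), predicted_class)
```

One query yields C relation scores, one per class in the sample set.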

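K-shot. This is similar to the one-shot case, except that $$x_i$$ now stands for a whole class: the embedded features of the K samples of that class are summed element-wise into a single class-level feature map. This pooled class-level feature map is then combined with the query image feature map as described above.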
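The pooling step is just an element-wise sum over the K embedded samples of a class. A minimal sketch, with a hypothetical helper name `class_feature` and made-up feature values:

```python
def class_feature(sample_features):
    """Element-wise sum of the K embedded samples of one class."""
    return [sum(values) for values in zip(*sample_features)]

# K = 3 embedded samples of one class, each a 4-dimensional toy feature.
features = [
    [1.0, 0.0, 2.0, 0.5],
    [0.0, 1.0, 2.0, 0.5],
    [1.0, 1.0, 0.0, 1.0],
]
pooled = class_feature(features)
print(pooled)  # [2.0, 2.0, 4.0, 2.0]
```

Because each class collapses to one feature map, a query still produces C relation scores regardless of K.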

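Objective Function. Mean squared error (MSE) is used, even though the targets are binary labels.

This choice is uncommon. However, because the model predicts relation scores, the problem can be regarded as regression, even though the ground truth is binary.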
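As a rough illustration of this objective, the sketch below regresses each relation score toward 1 when the class and query labels match and toward 0 otherwise. The function name `mse_objective` and the toy scores are assumptions for the example. It returns the sum of squared errors, so divide by the number of pairs if a mean is wanted.

```python
def mse_objective(scores, sample_labels, query_labels):
    """Sum of squared errors between relation scores and 0/1 match targets.

    scores[i][j] is the relation score between class i and query j.
    """
    total = 0.0
    for i, y_i in enumerate(sample_labels):
        for j, y_j in enumerate(query_labels):
            target = 1.0 if y_i == y_j else 0.0   # binary ground truth
            total += (scores[i][j] - target) ** 2
    return total

sample_labels = ["cat", "dog"]
query_labels = ["dog", "cat", "cat"]
scores = [
    [0.0, 1.0, 0.5],  # "cat" class vs. each query
    [1.0, 0.0, 0.5],  # "dog" class vs. each query
]
loss = mse_objective(scores, sample_labels, query_labels)
print(loss)  # 0.5
```

Only the third query contributes to the loss here, since its scores of 0.5 are equally far from both targets.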

3.3 Zero-Shot Learning

Zero-shot learning is similar to one-shot learning, but instead of a support set containing images for each class, it uses a semantic class embedding vector $$v_c$$ for each class. (?)

$$r_{i,j} = g_\phi(C(f_{\psi_1}(v_c), f_{\psi_2}(x_j)))$$, where $$i = 1, 2, \ldots, C$$
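The main difference from the one-shot case is that the class side and the image side go through two different embedding functions, since they live in different input spaces. The sketch below uses toy linear maps for both and a sigmoid-terminated linear relation function. The weights, the attribute vectors, and the class names are made up for illustration and are not from the paper.

```python
import math

def linear(x, weights):
    """Toy linear map; weights has one row per output dimension."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relation(pair_feature, weights):
    """Toy relation module: linear layer followed by a sigmoid."""
    z = sum(w * v for w, v in zip(weights, pair_feature))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fixed weights; a real model would learn these.
psi_1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]            # semantic vector (dim 3) -> dim 2
psi_2 = [[0.2, 0.1, -0.3, 0.4], [-0.1, 0.6, 0.2, 0.0]]  # image feature (dim 4) -> dim 2
phi = [0.7, -0.4, 0.9, 0.3]                             # relation weights over the 2 + 2 concat

class_vectors = {"zebra": [1.0, 0.0, 1.0], "horse": [1.0, 0.0, 0.0]}  # v_c per class
image_feature = [0.9, 0.1, 0.4, 0.7]                                  # x_j

image_embedding = linear(image_feature, psi_2)
zero_shot_scores = {
    name: relation(linear(v_c, psi_1) + image_embedding, phi)
    for name, v_c in class_vectors.items()
}
print(max(zero_shot_scores, key=zero_shot_scores.get))
```

Both embeddings map into the same feature dimension here only to keep the toy small. Concatenation itself does not require equal sizes.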

3.4 Network Architecture

A CNN is used as the relation module.
