Generalization Properties and Implicit Regularization for Multiple Passes SGM
Junhong Lin, et al.
Intro
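The paper studies early stopping and the step size as forms of implicit regularization.
When training a model with a single pass over the data, the step size has to be fine-tuned; with multiple passes, a universal step size can be used and the number of passes plays the role of the regularization parameter.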
Learning with SGM
SGD
$$ w_{t+1} = w_t - \eta_t V'(y_{j_t}, \langle w_t, \Phi(x_{j_t}) \rangle) \Phi(x_{j_t})
$$
where $$j_t$$ is the index of the training sample used at step $$t$$.
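The update above can be sketched in NumPy. This is only an illustration under assumptions that are not taken from the paper: the feature map $$\Phi$$ is the identity, the loss is the squared loss $$V(y, a) = (a - y)^2 / 2$$ (so $$V'(y, a) = a - y$$), the step size is constant, $$j_t$$ is drawn uniformly at random, and the data are synthetic and noise-free. The names `sgm`, `w_true`, and `w_hat` are hypothetical.

```python
import numpy as np


def sgm(X, y, step_size, n_passes, seed=0):
    """Multi-pass SGM sketch for least squares, assuming Phi(x) = x."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    # One "pass" is counted as n stochastic steps.
    for _ in range(n_passes * n):
        j = rng.integers(n)          # j_t: index of the sample used at step t
        pred = X[j] @ w              # <w_t, Phi(x_{j_t})>
        v_prime = pred - y[j]        # V'(y, a) = a - y for the squared loss
        w = w - step_size * v_prime * X[j]
    return w


# Synthetic, noise-free data (an assumption made only for this example).
data_rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])
X = data_rng.normal(size=(50, 3))
y = X @ w_true

w_hat = sgm(X, y, step_size=0.05, n_passes=20)
```

Here the number of passes and the step size are the only tuning knobs, which is the setting the notes describe: no explicit penalty term appears in the update.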
The paper sets out to estimate the expected excess risk
$$ \mathbb{E}_{z,J}[\mathcal{E}(w_T) - \inf_{w} \mathcal{E}(w)]
$$
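To make this quantity concrete, the sketch below approximates it by Monte Carlo on a toy model. Everything in it is an assumption for illustration, not the paper's setup: inputs are $$x \sim N(0, I)$$, labels are $$y = \langle w_\star, x \rangle + \text{noise}$$, the loss is the squared loss $$(a - y)^2 / 2$$, and $$\Phi$$ is the identity. Under these assumptions the excess risk has the closed form $$\|w - w_\star\|^2 / 2$$. The expectation is read here as being over the training sample $$z$$ and the index sequence $$J$$, and is approximated by averaging over independent trials. All function names are hypothetical.

```python
import numpy as np


def excess_risk(w, w_star):
    # Closed form under the assumed model (x ~ N(0, I), squared loss / 2):
    # E(w) - inf_w E(w) = ||w - w_star||^2 / 2.
    return 0.5 * float(np.sum((w - w_star) ** 2))


def run_sgm(X, y, step_size, n_steps, rng):
    # Constant-step SGM with uniformly sampled indices, starting from w = 0.
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):
        j = rng.integers(X.shape[0])
        w = w - step_size * (X[j] @ w - y[j]) * X[j]
    return w


def expected_excess_risk(n, d, n_passes, step_size, noise_std, n_trials, seed=0):
    rng = np.random.default_rng(seed)
    w_star = np.ones(d) / np.sqrt(d)   # hypothetical target with unit norm
    total = 0.0
    for _ in range(n_trials):
        # Fresh training sample z for each trial.
        X = rng.normal(size=(n, d))
        y = X @ w_star + noise_std * rng.normal(size=n)
        # Fresh index sequence J inside run_sgm.
        w_T = run_sgm(X, y, step_size, n_passes * n, rng)
        total += excess_risk(w_T, w_star)
    return total / n_trials


# Estimated expected excess risk after 0, 1, and 5 passes.
risk_by_passes = {
    p: expected_excess_risk(n=40, d=5, n_passes=p, step_size=0.02,
                            noise_std=0.5, n_trials=20)
    for p in (0, 1, 5)
}
```

Comparing the entries of `risk_by_passes` across pass counts gives a rough picture of how the number of passes trades off fitting against overfitting for a fixed step size.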