Spectrally-normalized margin bounds for neural networks

Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky

Intro

In deep learning, the number of parameters is typically much larger than the number of samples (#params >> #samples).

  • VC theory: $$VC = O(pL)$$, where p is the number of parameters and L is the number of layers
  • Neural net
    • CIFAR10, but with random labels
    • test error is very high.
  • Margin analysis
    • Linear classifier => neural net
    • Margin distribution

Figure 2 shows that data that is easier to learn should have a flatter margin distribution.
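To make the margin concrete, here is a minimal sketch of how a margin distribution could be computed from classifier scores. It assumes the usual multiclass margin, the score of the true class minus the best score among the other classes; the function names and the `scale` argument (a stand-in for whatever normalizer is used, such as a complexity term) are hypothetical and not taken from the paper's code.

```python
def margin(scores, label):
    """Multiclass margin: true-class score minus the best competing score."""
    best_other = max(s for j, s in enumerate(scores) if j != label)
    return scores[label] - best_other


def normalized_margins(all_scores, labels, scale=1.0):
    """Margins for a dataset, divided by a positive normalizer `scale`.

    `scale` is a placeholder (assumption): pass a complexity-based
    normalizer to compare margin distributions across networks.
    """
    return sorted(margin(s, y) / scale for s, y in zip(all_scores, labels))


# Toy scores for three examples, all with true label 0.
toy_scores = [[2.0, 0.5, -1.0], [1.0, 3.0, 2.0], [1.0, 0.5, 0.0]]
toy_labels = [0, 0, 0]
print(normalized_margins(toy_scores, toy_labels, scale=2.0))
```

A negative margin means the example is misclassified; plotting a histogram of the returned list gives the margin distribution.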

  • Spectral norm: $$\|A\|_\sigma$$ (the largest singular value of A)
  • Neuralnet: $$F_A(x) = \sigma_L(A_L(...))$$
    • L layers
  • Spectral complexity
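    • $$R_A = \left(\prod_{i=1}^L \rho_i \|A_i\|_\sigma\right) \left(\sum_{i=1}^L \frac{\|A_i^T - M_i^T\|_{2,1}^{2/3}}{\|A_i\|_\sigma^{2/3}}\right)^{3/2}$$
    • $$\rho_i$$ is the Lipschitz constant of $$\sigma_i$$, $$\|\cdot\|_\sigma$$ is the spectral norm, and $$\|\cdot\|_{2,1}$$ is the sum of the column 2-norms
    • The reference matrices $$M_i = I$$ correspond to a resnet, since $$Wx + x = (W+I)x$$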
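The spectral complexity can be evaluated numerically. The sketch below is an illustration in pure Python, not the authors' code: it assumes matrices are given as lists of rows, that every $$A_i$$ is nonzero, and that power iteration from a uniform start vector is accurate enough. All function names are hypothetical.

```python
import math


def spectral_norm(A, iters=200):
    """Estimate the largest singular value of A by power iteration on A^T A.

    Assumption: the uniform start vector is not orthogonal to the top
    right singular vector (true for the toy matrices used here).
    """
    rows, cols = len(A), len(A[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = A v
        w = [sum(A[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = A^T u
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        sigma = math.sqrt(norm_w)  # ||A^T A v|| tends to sigma_max^2 for unit v
        v = [x / norm_w for x in w]
    return sigma


def norm_2_1_of_transpose(B):
    """(2,1) norm of B^T: the columns of B^T are the rows of B."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in B)


def spectral_complexity(As, Ms, rhos):
    """R_A = (prod_i rho_i ||A_i||_s) * (sum_i (||A_i^T - M_i^T||_{2,1} / ||A_i||_s)^(2/3))^(3/2)."""
    prod, total = 1.0, 0.0
    for A, M, rho in zip(As, Ms, rhos):
        s = spectral_norm(A)  # assumed nonzero
        diff = [[a - m for a, m in zip(ra, rm)] for ra, rm in zip(A, M)]
        prod *= rho * s
        total += (norm_2_1_of_transpose(diff) / s) ** (2.0 / 3.0)
    return prod * total ** 1.5


A = [[2.0, 0.0], [0.0, 1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(spectral_complexity([A], [zero], [1.0]))             # close to 3.0
print(spectral_complexity([identity], [identity], [1.0]))  # 0.0: layer equals its reference matrix
```

The second call illustrates the resnet remark: a layer that equals its reference matrix contributes nothing to the sum.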

Theorem 1.1

  • *
    • If $$\|x_i\|_2 \le B$$ for every sample, then $$\|X\|_2 \le \sqrt{n}B$$, where $$X$$ collects the n samples
    • $$F_A \le r$$
    • return $$l_n$$ terms
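For reference, the bound in Theorem 1.1 has roughly the following shape. This is written from memory and should be checked against the paper; here $$W$$ is the maximum layer width, $$\gamma > 0$$ is the margin, $$\hat{R}_\gamma$$ is the empirical margin (ramp) risk, and the statement holds with probability at least $$1 - \delta$$ over the n samples.

$$\Pr[\arg\max_j F_A(x)_j \ne y] \le \hat{R}_\gamma(F_A) + \tilde{O}\left(\frac{\|X\|_2 R_A}{\gamma n}\ln(W) + \sqrt{\frac{\ln(1/\delta)}{n}}\right)$$

Substituting $$\|X\|_2 \le \sqrt{n}B$$ into the first term gives

$$\frac{\|X\|_2 R_A}{\gamma n}\ln(W) \le \frac{B R_A}{\gamma \sqrt{n}}\ln(W)$$

so the complexity term decays at the usual $$1/\sqrt{n}$$ rate, and the width enters only through a logarithm.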

This tells us how to obtain a bound that does not depend on the dimension.

Analysis of margin bound

ramp loss
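A minimal sketch of the ramp loss and the resulting empirical risk, assuming the standard definition with margin parameter $$\gamma$$: the loss is 0 for $$r < -\gamma$$, 1 for $$r > 0$$, and linear in between, and it is applied to the negated margin. The function names are hypothetical.

```python
def ramp_loss(r, gamma):
    """Ramp loss: 0 if r < -gamma, 1 if r > 0, linear interpolation in between."""
    if r < -gamma:
        return 0.0
    if r > 0:
        return 1.0
    return 1.0 + r / gamma


def margin(scores, label):
    """True-class score minus the best competing score."""
    return scores[label] - max(s for j, s in enumerate(scores) if j != label)


def empirical_ramp_risk(all_scores, labels, gamma):
    """Average ramp loss of the negated margins over a dataset."""
    losses = [ramp_loss(-margin(s, y), gamma) for s, y in zip(all_scores, labels)]
    return sum(losses) / len(losses)


# Margins are 1.5, -2.0 and 0.5, so the losses with gamma = 1 are 0, 1 and 0.5.
scores = [[2.0, 0.5], [1.0, 3.0], [1.0, 0.5]]
labels = [0, 0, 0]
print(empirical_ramp_risk(scores, labels, gamma=1.0))  # 0.5
```

The ramp loss upper-bounds the 0/1 loss and is $$1/\gamma$$-Lipschitz, which is what makes it convenient for a margin analysis.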

  1. step 1, covering bound per layer
  2. step 2, induction
  3. step 3, covering bound for the whole network
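The induction step can be sketched with one triangle inequality. The notation here is an assumption for illustration: $$X$$ is the output of the previous layers, $$\hat{X}$$ is its cover element, $$\hat{A}$$ is the cover element chosen for the layer matrix $$A$$, and $$\sigma$$ is $$\rho$$-Lipschitz.

$$\|\sigma(AX) - \sigma(\hat{A}\hat{X})\|_2 \le \rho\|AX - \hat{A}\hat{X}\|_2 \le \rho\left(\|A\|_\sigma \|X - \hat{X}\|_2 + \|(A - \hat{A})\hat{X}\|_2\right)$$

The first term is controlled by the cover of the earlier layers (the induction hypothesis), scaled by the spectral norm of the current layer. The second term is controlled by the per-layer covering bound from step 1. Unrolling this over all L layers yields a cover of the whole network, which is where the product of spectral norms comes from.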

Appendix

This paper happened to come out during the Simons Institute course.
