Variational Graph Auto-Encoders
Thomas N. Kipf, Max Welling
A latent variable model for graph-structured data
Uses a graph convolutional network (GCN) encoder and an inner product decoder.
adjacency matrix $$A$$, degree matrix $$D$$.
Stochastic latent variables $$z_i$$, summarized in an $$N \times F$$ matrix $$Z$$.
Node features are collected in an $$N \times D$$ matrix $$X$$.
Inference model: a two-layer GCN
$$ q(Z|X,A) = \prod_{i=1}^N q(z_i | X,A), \text{with }q(z_i |X,A) = \mathcal{N}(z_i | \mu_i, \text{diag}(\sigma_i^2))
$$
Here $$\mu = \mathrm{GCN}_\mu(X,A)$$ is the matrix of mean vectors $$\mu_i$$; similarly, $$\log \sigma = \mathrm{GCN}_\sigma(X,A)$$. The two-layer GCN is the standard definition, $$\mathrm{GCN}(X,A) = \tilde A \,\mathrm{ReLU}(\tilde A X W_0) W_1$$ with $$\tilde A = D^{-1/2} A D^{-1/2}$$, where $$\mathrm{GCN}_\mu$$ and $$\mathrm{GCN}_\sigma$$ share the first-layer weights $$W_0$$.
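The inference model can be sketched in NumPy as below. This is a minimal illustration, not the authors' implementation: the self-loops in the normalization follow the usual GCN convention, and the weight names `W0`, `W_mu`, `W_sigma` are placeholders.

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (self-loops assumed)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def gcn_encoder(X, A, W0, W_mu, W_sigma):
    """Two-layer GCN encoder: a shared first layer, then separate output
    layers producing the N x F matrices mu and log(sigma)."""
    A_tilde = normalize_adj(A)
    H = np.maximum(A_tilde @ X @ W0, 0.0)  # first layer with ReLU
    mu = A_tilde @ H @ W_mu                # per-node mean vectors
    log_sigma = A_tilde @ H @ W_sigma      # per-node log std deviations
    return mu, log_sigma
```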
Generative model: an inner product between latent variables (this does not carry over directly to multiple edge types, though).
$$p(A|Z) = \prod_{i=1}^N \prod_{j=1}^N p(A_{ij} | z_i, z_j)$$, with $$p(A_{ij}=1|z_i, z_j) = \sigma (z_i^T z_j)$$
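The decoder is just a sigmoid applied to the Gram matrix of the latent vectors; a minimal NumPy sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(Z):
    """Inner product decoder: entry (i, j) is p(A_ij = 1 | z_i, z_j)
    = sigmoid(z_i^T z_j); the result is a symmetric N x N matrix."""
    return sigmoid(Z @ Z.T)
```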
Learning: optimize the variational lower bound $$\mathcal{L}$$ w.r.t. the variational parameters $$W_i$$:
$$ \mathcal{L} = \mathbb{E}_{q(Z|X,A)} [\log p(A|Z)] - KL[q(Z|X,A) || p(Z)]
$$
Here $$Z$$ is sampled via the reparameterization trick, and a Gaussian prior $$p(Z) = \prod_i p(z_i) = \prod_i \mathcal{N}(z_i | 0, I)$$ is used. (A multivariate Bernoulli could also be considered.)
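With a diagonal Gaussian posterior and a standard-normal prior, both ingredients of $$\mathcal{L}$$ are easy to write down: sampling via the reparameterization trick, and the KL term in closed form. A minimal sketch (function names are illustrative):

```python
import numpy as np

def reparameterize(mu, log_sigma, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), so gradients can
    flow through mu and log_sigma (reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

def kl_to_standard_normal(mu, log_sigma):
    """Closed-form KL[ N(mu, diag(sigma^2)) || N(0, I) ], summed over
    all nodes and latent dimensions."""
    return 0.5 * np.sum(np.exp(2 * log_sigma) + mu**2 - 1.0 - 2 * log_sigma)
```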
Non-probabilistic graph auto-encoder (GAE) model:
$$\hat A = \sigma (ZZ^T)$$, with $$Z = GCN(X,A)$$
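The non-probabilistic GAE variant drops the sampling entirely; a self-contained NumPy sketch (self-loops in the normalization and the weight names are assumptions, as above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gae_reconstruct(X, A, W0, W1):
    """Non-probabilistic GAE: deterministic embeddings Z = GCN(X, A),
    then reconstruction A_hat = sigmoid(Z Z^T)."""
    A_hat_sl = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat_sl.sum(axis=1)))
    A_tilde = d_inv_sqrt @ A_hat_sl @ d_inv_sqrt
    Z = A_tilde @ np.maximum(A_tilde @ X @ W0, 0.0) @ W1  # two-layer GCN
    return sigmoid(Z @ Z.T)
```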
Appendix
NIPS Workshop on Bayesian Deep Learning (2016)