Position-aware Graph Neural Networks

Jiaxuan You, Rex Ying, Jure Leskovec

Proposed Approach

The Framework of P-GNNs

key components:

k anchor-sets of different sizes
message computation function $$F$$, combines feature information of 2 nodes with their network distance
matrix $$M$$ of anchor-set messages, each row is one anchor-set message $$M_i$$ computed by $$F$$
aggregation functions $$AGG{M}, AGG{S}$$, aggregate/transform feature information of the nodes in the anchor-set and aggregate it across the anchor-set
vector $$w$$ that projects $$M$$ to a lower-dimensional embedding space $$z$$

P-GNN训练过程：有多层P-GNN layers

第 $$l$$ 层首先随机sample $$k$$ 个random anchor-sets $$S_i$$
在output node embedding $$z_v$$ 中第 $$i$$ 个维度表示从第 $$i$$ 个anchor-sets中得到的message information
1. 这个embedding的每一个维度计算：首先通过message computation function $$F$$ 计算和这个anchor-set中每一个node的距离。然后使用message aggregation function $$AGG_{M}$$。最后利用non-linear得到一个scalar。
2. 注意每一个node的message都同时包含了node position的distance和node feature information。

Achor-set Selection

Bourgain's Theorem: Given any finite metric space $$(V,d)$$ with $$|V|=n$$, there exists an embedding of $$(V,D)$$ into $$\mathbb{R}^k$$ under any $$l_p$$ metric, where $$k=O(\log ^2 n)$$, and the distortion of the embedding is $$O(\log n)$$.

distortion的意思是衡量两个metric space在embedding前后distance的维持情况。

根据Bourgain's Theorem，P-GNNs可以认为是符合这个的embedding embedding method的泛化。它提供了两个insights：

至少要 $$O(\log^2 n)$$ anchor-sets老保证low distortion embedding
这些anchor-sets的大小是exponentially分布的

根据以上的原则，P-GNNs选择$$k = c \log^2 n$$ random anchor-sets。对于每一个anchor-set $$S_{i,j}$$，we sample each node independently with $$1 / 2^i$$。

Design decisions for P-GNNs

Message Computation Function $$F$$

Position-based similarities可以通过shortest path distance。但是复杂度太高，因此使用q-hop shortest path distance：

$$ d^q{sp} (v,u) = \begin{cases} d{sp}(u,v),&\text{if }d_{sp}(u,v) \le q\ \infty, &\text{otherwise} \end{cases}

其中$$d{sp}$$是shortest path。注意1-hop distance就是adjacency matrix。因为我们需要的是相似度，因此这么转换为distance $$s(u,v) = \frac{1}{d{sp}^q(u,v)+1}$$。

Feature information 可以这么融合到node $$h_u$$ 中：

通过将neighborhood nodes的信息利用传递，类似GCN的模型
将node features $$h_u$$ 和 $$h_v$$ 给 concatenate，类似GraphSAGE
其他方法类似attention

之后就可以用concatentation或者product的方法，将position和feature结合。我们发现product比较好用。specifically，我们使用以下的message passing function：

$$F(u,v,h_u,h_v) = s(v,u) \cdot CONCAT(h_v, h_u)$$

Message Aggregation Functions $$AGG$$

可以用任意的ordering invariant function，比如max，min，sum，mean。

Position-aware Graph Neural Networks

Position-aware Graph Neural Networks