Strategies for Pre-Training Graph Neural Networks
Weihua Hu et al.
3 Strategies for Pre-Training Graph Neural Networks
3.1 Node-Level Pre-Training
Two self-supervised methods are proposed: Context Prediction and Attribute Masking.
3.1.1 CONTEXT PREDICTION: EXPLOITING DISTRIBUTION OF GRAPH STRUCTURE
Context Prediction uses a subgraph to predict its surrounding graph structure.
Neighborhood and context graphs. The K-hop neighborhood is a standard notion. The context graph of a node v is defined by two hyperparameters, r1 and r2: it is the subgraph consisting of everything between r1 hops and r2 hops from v. The paper requires r1 < K, so the neighborhood graph and the context graph partially overlap; the nodes in this overlap are called anchor nodes.
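The definitions above can be made concrete with a small sketch over an adjacency-list graph; the helper names are mine, not the paper's.

```python
from collections import deque

def hop_distances(adj, v):
    """BFS hop distances from node v over an adjacency-list graph."""
    dist = {v: 0}
    q = deque([v])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

def neighborhood_context_anchors(adj, v, K, r1, r2):
    """K-hop neighborhood, context-graph nodes (between r1 and r2 hops),
    and their overlap (the anchor nodes). r1 < K guarantees the overlap
    is possible."""
    assert r1 < K
    dist = hop_distances(adj, v)
    neighborhood = {u for u, d in dist.items() if d <= K}
    context = {u for u, d in dist.items() if r1 <= d <= r2}
    return neighborhood, context, neighborhood & context

# Path graph 0-1-2-3-4-5, centered at node 0.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
nbr, ctx, anchors = neighborhood_context_anchors(adj, 0, K=2, r1=1, r2=4)
# nbr = {0, 1, 2}, ctx = {1, 2, 3, 4}, anchors = {1, 2}
```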
Encoding context into a fixed vector using an auxiliary GNN. To make context prediction tractable, the context graph is mapped to a fixed-length vector with an auxiliary context GNN: the context GNN produces node embeddings on the context graph, and the embeddings of the anchor nodes are averaged to give the context embedding.
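The averaging step is just an element-wise mean over the anchor nodes' embeddings; a minimal sketch (function name is my own):

```python
def context_embedding(anchor_node_embs):
    """Fixed-length context vector: the element-wise average of the
    context GNN's embeddings at the anchor nodes."""
    dim = len(anchor_node_embs[0])
    n = len(anchor_node_embs)
    return [sum(e[i] for e in anchor_node_embs) / n for i in range(dim)]

# Two anchor nodes with 2-dimensional embeddings.
c = context_embedding([[1.0, 2.0], [3.0, 4.0]])  # [2.0, 3.0]
```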
Learning via negative sampling. Negative sampling is used to jointly train the main GNN and the context GNN: the main GNN produces a node embedding from the node's neighborhood, while the context GNN produces the context embedding from the context graph. The context-prediction objective is a binary classification of whether a particular neighborhood and a particular context graph belong to the same node.
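One common way to write such a negative-sampling objective is a binary logistic loss over dot products: score the neighborhood embedding against its own context (positive) and against contexts sampled from other nodes (negatives). This is a sketch of that standard form, not the paper's exact implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def context_prediction_loss(h_v, c_pos, c_negs):
    """Binary logistic loss: the neighborhood embedding h_v should score
    high against its own context embedding c_pos and low against context
    embeddings c_negs drawn from other (negative) nodes."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    loss = -math.log(sigmoid(dot(h_v, c_pos)))
    for c_neg in c_negs:
        loss += -math.log(sigmoid(-dot(h_v, c_neg)))
    return loss

# A matched pair scores a lower loss than a mismatched one.
good = context_prediction_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
bad = context_prediction_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
```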
After this training finishes, the main GNN is kept as the pre-trained model.
3.1.2 ATTRIBUTE MASKING: EXPLOITING DISTRIBUTION OF GRAPH ATTRIBUTES
Masking node and edge attributes. The idea is simple: mask node and edge attributes, then train the GNN to predict them. This resembles BERT, but whereas BERT effectively treats the words in a sentence as fully connected, here the prediction must follow the graph topology (analogous to comparing message passing with full attention).
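The masking step itself can be sketched as follows; the mask token and mask rate are illustrative choices, not the paper's exact values:

```python
import random

MASK_TOKEN = -1  # hypothetical stand-in for a special mask embedding

def mask_node_attributes(attrs, mask_rate, rng):
    """Replace a random subset of node attributes with a mask token and
    record the original values as reconstruction targets. The GNN is
    then trained to predict each target from the masked graph."""
    masked = list(attrs)
    targets = {}
    for i in range(len(attrs)):
        if rng.random() < mask_rate:
            targets[i] = attrs[i]
            masked[i] = MASK_TOKEN
    return masked, targets

rng = random.Random(0)
attrs = [5, 7, 9, 11]
masked, targets = mask_node_attributes(attrs, mask_rate=0.5, rng=rng)
```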
3.2 GRAPH-LEVEL PRE-TRAINING
A graph embedding is built from the node embeddings described above. There are two options: predicting properties of the entire graph, and predicting graph structure.
3.2.1 SUPERVISED GRAPH-LEVEL PROPERTY PREDICTION
Graph-level properties are predicted with supervised multi-task learning, so node-level and graph-level embeddings are trained jointly.
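A minimal sketch of the graph-level setup, assuming mean pooling as the readout and one linear head per supervised property (both are common choices, not necessarily the paper's exact architecture):

```python
def mean_readout(node_embs):
    """READOUT: pool node embeddings into one graph-level embedding;
    mean pooling is one common choice."""
    n, dim = len(node_embs), len(node_embs[0])
    return [sum(e[i] for e in node_embs) / n for i in range(dim)]

def multitask_logits(graph_emb, heads):
    """One linear head per supervised graph-level property; training the
    heads on top of a shared GNN trains node- and graph-level
    representations jointly. `heads` is a list of (weights, bias) pairs."""
    dot = lambda w, x: sum(wi * xi for wi, xi in zip(w, x))
    return [dot(w, graph_emb) + b for w, b in heads]

g = mean_readout([[1.0, 3.0], [3.0, 1.0]])                       # [2.0, 2.0]
logits = multitask_logits(g, [([1.0, 0.0], 0.0), ([0.5, 0.5], 1.0)])
```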
3.2.2 STRUCTURAL SIMILARITY PREDICTION
The other option is to predict the structural similarity of two graphs, e.g. graph edit distance or a graph similarity score. However, ground-truth similarity is hard to obtain, so the paper does not pursue this direction.
3.3 OVERVIEW: PRE-TRAINING GNNS AND FINE-TUNING FOR DOWNSTREAM TASKS
In summary: first node-level self-supervised pre-training, then graph-level multi-task supervised pre-training, and finally fine-tuning on downstream tasks.
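The overall recipe is just a fixed sequence of training stages applied to the same GNN. A toy sketch, where the stage functions are hypothetical stand-ins that only record the order of execution:

```python
def run_pipeline(model, stages):
    """Run the pre-training/fine-tuning recipe as a sequence of stages,
    each mapping model parameters to updated parameters."""
    for stage in stages:
        model = stage(model)
    return model

trace = []
node_pretrain = lambda m: (trace.append("node-level self-supervised"), m)[1]
graph_pretrain = lambda m: (trace.append("graph-level supervised"), m)[1]
finetune = lambda m: (trace.append("fine-tune downstream"), m)[1]

run_pipeline({}, [node_pretrain, graph_pretrain, finetune])
```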