PotentialNet for Molecular Property Prediction

Evan N. Feinberg, Vijay S. Pande

Neural Network Architectures

Ligand-Based Scoring Models

Fully Connected Neural Networks

Graph Convolutional Neural Networks

$$h_i^{(t+1)} = U^{(t)} \bigg( h_i^{(t)}, \sum_{v_j \in N(v_i)} m^{(t)}(h_j^{(t)}) \bigg)$$

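$$h_i^{(t)}$$ denotes the features of node $$i$$ after layer $$t$$, $$N(v_i)$$ denotes the neighbors of node $$i$$, and $$U^{(t)}$$ and $$m^{(t)}$$ are the update and message functions of layer $$t$$. When there are several edge types, $$m^{(t,e)}$$ is the message function for edge type $$e$$.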

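Our model is based on gated graph neural networks (GGNN). In every layer, the update function is essentially a gated recurrent unit (GRU). The message function is a simple linear operation: it differs for each edge type but appears to be shared across layers (?). Writing $$A^{(e)}$$ for the adjacency matrix of edge type $$e$$, $$W^{(e)}$$ for its weight matrix, and $$N_{et}$$ for the number of edge types: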

$$h_i^{(t+1)} = \mathrm{GRU} \bigg( h_i^{(t)}, \sum_{e}^{N_{et}} W^{(e)} A^{(e)} h^{(t)} \bigg)$$
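To make the gated update concrete, here is a minimal NumPy sketch of one GGNN step. It is an illustration under stated assumptions, not the authors' implementation: the names `gru_cell` and `ggnn_layer` are hypothetical, biases are omitted, a textbook GRU cell is assumed, and node features are stored as rows, so the message $$W^{(e)} A^{(e)} h^{(t)}$$ is computed as `A @ h @ W`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(h, m, p):
    """Textbook GRU update (biases omitted): h is the state, m the incoming message."""
    z = sigmoid(m @ p["Wz"] + h @ p["Uz"])               # update gate
    r = sigmoid(m @ p["Wr"] + h @ p["Ur"])               # reset gate
    h_tilde = np.tanh(m @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return (1.0 - z) * h + z * h_tilde

def ggnn_layer(h, adjs, weights, gru_params):
    """One step: sum linear messages over edge types, then apply the GRU update."""
    m = sum(A @ h @ W for A, W in zip(adjs, weights))
    return gru_cell(h, m, gru_params)

# Toy molecule with 4 atoms and two edge types (values are illustrative only).
rng = np.random.default_rng(0)
n_atoms, dim = 4, 8
single = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
double = np.array([[0, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]], dtype=float)
adjs = [single, double]
weights = [rng.normal(scale=0.1, size=(dim, dim)) for _ in adjs]
gru_params = {k: rng.normal(scale=0.1, size=(dim, dim))
              for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

h0 = rng.normal(size=(n_atoms, dim))
h = h0
for _ in range(3):  # the same weights are reused at every step
    h = ggnn_layer(h, adjs, weights, gru_params)
```

Reusing `weights` and `gru_params` inside the loop reflects the reading that the message function is shared across layers; if each layer had its own parameters, the loop would index into a per-layer list instead.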

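The last layer of the GCNN is a graph gather step, which sums up the final node embeddings; this step is invariant to node ordering. (But are the preceding message passing layers ordering invariant? With sum aggregation they are permutation equivariant: relabeling the nodes permutes the rows of $$h$$ in the same way, so the gathered sum is unchanged.)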
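The ordering invariance of the gather step can be checked numerically. The sketch below assumes the plain sum described in these notes; any learned per-node transformation applied before the sum is left out, and `graph_gather` is a hypothetical name.

```python
import numpy as np

def graph_gather(h):
    """Sum the final node embeddings (one row per node) into a single graph vector."""
    return h.sum(axis=0)

rng = np.random.default_rng(0)
h_final = rng.normal(size=(5, 3))   # 5 nodes, 3 features each
perm = rng.permutation(5)           # an arbitrary relabeling of the nodes

gathered = graph_gather(h_final)
gathered_permuted = graph_gather(h_final[perm])
```

Because addition is commutative, `gathered` and `gathered_permuted` agree up to floating point rounding, whatever permutation is drawn.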

Structure-Based Scoring Models

PotentialNet Architectures for Molecular Property Prediction

This model also takes the structural information of the target into account.

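$$R$$ is the distance matrix; pairs of atoms whose distance falls below a chosen threshold are treated as adjacent, which defines the adjacency matrix (so it carries some spatial information).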
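As a small illustration of building an adjacency matrix from distances, the sketch below thresholds pairwise Euclidean distances. The coordinates and the cutoff of 5.0 are made-up values, and `distance_matrix` and `spatial_adjacency` are hypothetical helper names.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances R[i, j] between atom coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def spatial_adjacency(R, cutoff):
    """Mark atom pairs closer than `cutoff` as adjacent, excluding self-pairs."""
    A = (R < cutoff).astype(float)
    np.fill_diagonal(A, 0.0)
    return A

coords = np.array([[0.0, 0.0, 0.0],
                   [1.5, 0.0, 0.0],
                   [0.0, 4.0, 0.0],
                   [10.0, 0.0, 0.0]])
R = distance_matrix(coords)
A = spatial_adjacency(R, cutoff=5.0)
```

In this toy example the first three atoms are mutually within the cutoff, while the fourth is farther than 5.0 from all of them and ends up with no neighbors.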

Because of this special structure (sparsity), both $$A$$ and $$R$$ can be written as block matrices (one BFS pass over the graph).

Besides the generalization to edge types, we also introduce nonlinearity into the graph convolution layer:

$$h_i^{(K)} = \mathrm{GRU} \bigg( h_i^{(K-1)}, \sum_{e}^{N_{et}} \sum_{j \in N^{(e)}(v_i)} NN^{(e)} (h_j^{(K-1)}) \bigg)$$
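The nonlinear message can be sketched by replacing each linear map with a small network per edge type. The two-layer ReLU form below is an assumption made for illustration (the notes do not specify the shape of $$NN^{(e)}$$), and `mlp_message` and `aggregate_messages` are hypothetical names. Only the aggregated message is computed; it would then be fed to the GRU update.

```python
import numpy as np

def mlp_message(h, W1, b1, W2, b2):
    """Assumed two-layer ReLU network applied to every node's features (rows of h)."""
    return np.maximum(h @ W1 + b1, 0.0) @ W2 + b2

def aggregate_messages(h, adjs, mlps):
    """For each edge type e, sum NN^(e)(h_j) over the type-e neighbors j of every node."""
    return sum(A @ mlp_message(h, *params) for A, params in zip(adjs, mlps))

rng = np.random.default_rng(0)
n_atoms, dim, hidden = 3, 4, 6
# Two edge types on a 3-atom toy graph (illustrative values only).
adjs = [np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float),
        np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)]
mlps = [(rng.normal(size=(dim, hidden)), rng.normal(size=hidden),
         rng.normal(size=(hidden, dim)), rng.normal(size=dim)) for _ in adjs]
h = rng.normal(size=(n_atoms, dim))

messages = aggregate_messages(h, adjs, mlps)
```

Multiplying by `A` after applying the network is what turns the per-node outputs into a sum over neighbors: row $$i$$ of `A @ X` adds up the rows of `X` belonging to the neighbors of node $$i$$.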

PotentialNet consists of three stages:

covalent-only propagation: only bond (ligand) information enters the graph convolution
dual noncovalent and covalent propagation: both bond-based and spatial distance-based propagation are used
ligand-based graph gather: a graph gather operation over the ligand atoms only, where each atom's features come from the bond and distance propagation of the previous stage
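The three stages can be strung together in a compact sketch. This is a simplification under loud assumptions: a `tanh` residual update stands in for the GRU, messages are linear, the gather is a plain sum, and every name here (`propagate`, `potentialnet_sketch`, the `params` keys) is hypothetical rather than taken from the paper's code.

```python
import numpy as np

def propagate(h, adjs, weights, steps):
    """Repeat h <- tanh(h + sum_e A^(e) h W^(e)); a simplified stand-in for the GRU update."""
    for _ in range(steps):
        h = np.tanh(h + sum(A @ h @ W for A, W in zip(adjs, weights)))
    return h

def potentialnet_sketch(x, A_cov, A_noncov, is_ligand, params, steps=2):
    # Stage 1: covalent-only propagation over bond edges.
    h1 = propagate(x, [A_cov], [params["W_cov1"]], steps)
    # Stage 2: propagation over both covalent and distance-based (noncovalent) edges.
    h2 = propagate(h1, [A_cov, A_noncov],
                   [params["W_cov2"], params["W_noncov"]], steps)
    # Stage 3: graph gather (sum) restricted to ligand atoms.
    return h2[is_ligand].sum(axis=0)

rng = np.random.default_rng(0)
dim = 4
# Atoms 0-2 form the ligand, atoms 3-4 belong to the protein (toy example).
is_ligand = np.array([True, True, True, False, False])
A_cov = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (3, 4)]:      # bonds stay within each molecule
    A_cov[i, j] = A_cov[j, i] = 1.0
A_noncov = np.zeros((5, 5))
A_noncov[2, 3] = A_noncov[3, 2] = 1.0      # one ligand-protein spatial contact
params = {k: rng.normal(scale=0.3, size=(dim, dim))
          for k in ("W_cov1", "W_cov2", "W_noncov")}
x = rng.normal(size=(5, dim))

embedding = potentialnet_sketch(x, A_cov, A_noncov, is_ligand, params)
```

Although the gather only sums ligand atoms, protein information still reaches the result through the noncovalent edges of stage 2. If `A_noncov` were all zeros, the ligand rows would never see the protein atoms and the output would reduce to a ligand-only embedding.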
