DeeplyTough: Learning Structural Comparison of Protein Binding Sites

Martin Simonovsky, Joshua Meyers

Intro

pocket matching

The paper proposes using a CNN to compress 3D protein information into descriptors.

System and Methods

Volumetric Input Representation

As in pocket detection and protein-ligand affinity prediction, the protein structure is treated as a 3D image with c channels.

Here c = 8. The channels are atom presence, hydrophobicity, aromaticity, ability to accept or donate hydrogen bonds (two channels), positive or negative ionizability (two channels), and being metallic.

Occupancy is represented by a smooth indicator function of the atoms' van der Waals radii:

$$ f(x)_h = \max_{a \in \mathcal{A}_h} 1 - \exp\left(-\left(\frac{r_a}{\| x - x_a \|_2}\right)^{12}\right) $$

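where $$h$$ is the channel, $$\mathcal{A}_h$$ is the set of atoms with property $$h$$, $$r_a$$ is the van der Waals radius of atom $$a$$, and $$x_a$$ is its position. In other words, points that are not atom centers receive a kind of interpolated value.

Featurization uses the high-throughput molecular dynamics (HTMD) package.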
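To make the occupancy formula concrete, here is a minimal NumPy sketch that evaluates it at arbitrary points for a single channel. The function name `smooth_occupancy`, the array layout, and the example radius of 1.7 are assumptions made for illustration; this is not the HTMD API and not the authors' code.

```python
import numpy as np

def smooth_occupancy(grid_points, atom_positions, atom_radii):
    """Evaluate max_a 1 - exp(-(r_a / ||x - x_a||_2)^12) at each grid point.

    Hypothetical helper: the atoms passed in are assumed to be the ones
    belonging to a single channel h.
    """
    grid_points = np.asarray(grid_points, dtype=float)        # shape (G, 3)
    atom_positions = np.asarray(atom_positions, dtype=float)  # shape (A, 3)
    atom_radii = np.asarray(atom_radii, dtype=float)          # shape (A,)

    # Pairwise distances between grid points and atom centers, shape (G, A).
    diff = grid_points[:, None, :] - atom_positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Avoid division by zero exactly at an atom center; the value saturates to 1 there.
    dist = np.maximum(dist, 1e-9)

    per_atom = 1.0 - np.exp(-(atom_radii[None, :] / dist) ** 12)

    # The channel value is the maximum contribution over all atoms.
    return per_atom.max(axis=1)

# One atom at the origin with an assumed radius of 1.7.
points = [[0.0, 0.0, 0.0], [1.7, 0.0, 0.0], [17.0, 0.0, 0.0]]
occupancy = smooth_occupancy(points, [[0.0, 0.0, 0.0]], [1.7])
```

The value is 1 at the atom center, 1 - 1/e at exactly one radius away, and it drops to nearly 0 shortly beyond that because of the 12th power, which is what makes the function a smooth stand-in for a hard sphere.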

Learning Pocket Descriptors

descriptor learning

The task is to learn two clusters, corresponding to positive and negative pairs.

For any pair of pockets, the contrastive loss is defined as

$$ L_c(p_1, p_2) = \begin{cases} \| d_\theta(p_1) - d_\theta(p_2) \|_2^2, & (p_1, p_2) \in P \\ \max(0, m - \| d_\theta(p_1) - d_\theta(p_2) \|_2)^2, & (p_1, p_2) \in N \end{cases} $$

where the descriptor function $$d_\theta$$ is the CNN, $$m$$ is the margin, and $$P$$ and $$N$$ are the sets of positive and negative pairs.
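The loss for one pair can be sketched in a few lines of NumPy. Here the descriptors are plain vectors standing in for the CNN outputs; the function name `contrastive_loss` and the default `margin=1.0` are placeholders chosen for illustration, not values from the paper.

```python
import numpy as np

def contrastive_loss(desc_1, desc_2, is_positive, margin=1.0):
    """Contrastive loss for one pair of descriptors (hypothetical helper).

    desc_1, desc_2: descriptor vectors, standing in for the CNN outputs.
    is_positive: True if the pair is a positive pair, False if negative.
    margin: the margin m; 1.0 is an arbitrary placeholder.
    """
    diff = np.asarray(desc_1, dtype=float) - np.asarray(desc_2, dtype=float)
    dist = float(np.linalg.norm(diff))  # Euclidean distance between descriptors
    if is_positive:
        # Positive pairs are pulled together: squared distance.
        return dist ** 2
    # Negative pairs are pushed apart until they are at least `margin` away.
    return max(0.0, margin - dist) ** 2

loss_pos = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_positive=True)
loss_neg_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_positive=False)
loss_neg_near = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_positive=False, margin=6.0)
```

A negative pair that is already farther apart than the margin contributes zero loss, so training effort goes to the pairs that are still confusable.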

Network Architecture

The network is a 3D steerable CNN [1]. Its steerable kernel basis keeps the number of parameters small, only $$10^5$$. To do: look up [1].

Appendix

[1] Weiler, M. et al. (2018). 3D steerable CNNs: Learning rotationally equivariant features in volumetric data. arXiv preprint arXiv:1807.02547.
