Graph Auto-encoder
1 Structural Deep Network Embedding
SDNE, proposed in [Wang D, 2016, 1], embeds graph data using the autoencoder idea. The paper defines first-order and second-order proximity, and aims to keep the vertices' first- and second-order proximity relations unchanged before and after encoding.
First-Order Proximity. The first-order proximity describes the pairwise proximity between vertexes. For any pair of vertexes, if $a_{ij} > 0$, there exists positive first-order proximity between $v_i$ and $v_j$. Otherwise, the first-order proximity between $v_i$ and $v_j$ is 0.
Second-Order Proximity. The second-order proximity between a pair of vertexes describes the proximity of the pair's neighborhood structure. Let $\mathcal{N}_u = \{ a_{u,1}, \cdots, a_{u,|\mathcal{V}|} \}$ denote the first-order proximity between $v_u$ and other vertexes. Then, second-order proximity is determined by the similarity of $\mathcal{N}_u$ and $\mathcal{N}_v$.
Encoder (layer-wise mapping):
\begin{aligned}
\vec{y}_i^{(0)} &= \vec{x}_i, \\
\vec{y}_i^{(k)} &= \sigma \left( W^{(k)} \vec{y}_i^{(k-1)} + \vec{b}^{(k)} \right), \quad k = 1, \cdots, K.
\end{aligned}
\tag{1.1}
Reconstruction loss:
\mathcal{L} = \sum_{i=1}^{n} \| \hat{\vec{x}}_i - \vec{x}_i \|_2^2.
\tag{1.2}
First-order proximity loss:
\mathcal{L}_{\text{1st}} = \sum_{i,j=1}^{n} a_{i,j} \| \vec{y}_i^{(K)} - \vec{y}_j^{(K)} \|_2^2.
\tag{1.3}
Second-order proximity loss:
\begin{aligned}
\mathcal{L}_{\text{2nd}} &= \sum_{i=1}^{n} \| \left( \hat{\vec{x}}_i - \vec{x}_i \right) \odot \vec{b}_i \|_2^2 \\
&= \| \left( \hat{X} - X \right) \odot B \|_F^2.
\end{aligned}
\tag{1.4}
Here $\vec{b}_i$ is defined element-wise: $b_{i,j} = 1$ when $a_{i,j} = 0$, and $b_{i,j} = \beta > 1$ (a hyperparameter) when $a_{i,j} > 0$, so reconstruction errors on observed edges are penalized more heavily. Through the autoencoder, two vertices with similar neighborhood structures end up closer to each other in the representation space.
Regularization term:
\mathcal{L}_{\text{reg}} = \frac{1}{2} \sum_{k=1}^{K} \left( \| W^{(k)} \|_F^2 + \| \hat{W}^{(k)} \|_F^2 \right).
\tag{1.5}
The total loss is:
\mathcal{L}_{\text{mix}} = \mathcal{L}_{\text{2nd}} + \alpha \mathcal{L}_{\text{1st}} + \gamma \mathcal{L}_{\text{reg}}.
\tag{1.6}
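A minimal numpy sketch of losses (1.3), (1.4), and (1.6); the values of $\beta$ and $\alpha$, and the omission of $\mathcal{L}_{\text{reg}}$ (which an optimizer's weight decay would usually cover), are my assumptions, not the paper's settings.

```python
import numpy as np

def sdne_losses(X, X_hat, Y, A, beta=5.0, alpha=0.1):
    """SDNE losses for one pass.
    X:     (n, n) adjacency rows used as inputs x_i
    X_hat: (n, n) autoencoder reconstructions
    Y:     (n, d) embeddings y_i^{(K)}
    A:     (n, n) adjacency matrix
    """
    # Eq. (1.4): second-order loss; observed edges are up-weighted by beta > 1
    B = np.where(A > 0, beta, 1.0)
    L_2nd = np.sum(((X_hat - X) * B) ** 2)
    # Eq. (1.3): first-order loss pulls embeddings of linked vertices together
    diff = Y[:, None, :] - Y[None, :, :]            # (n, n, d) pairwise differences
    L_1st = np.sum(A * np.sum(diff ** 2, axis=-1))
    # Eq. (1.6) without the L_reg term
    return L_2nd + alpha * L_1st
```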
2 Deep Neural Networks for Learning Graph Representations
DNGR [Cao S, 2016, 2] combines random walks with a deep autoencoder. The model consists of three parts: random surfing, positive pointwise mutual information (PPMI) computation, and a stacked denoising autoencoder.
Random surfing:
Given the maximum number of steps and the transition matrix $A$:
p_k = \alpha p_{k-1} A + (1 - \alpha) p_0.
\tag{2.1}
where $\alpha$ is the probability of continuing the surfing process; with probability $1 - \alpha$ the walk restarts from the initial distribution $p_0$.
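A short numpy sketch of the recursion (2.1); starting one surfer per vertex and summing the $p_k$ into a co-occurrence matrix is an assumption about how the result feeds the PPMI step.

```python
import numpy as np

def random_surfing(A, K=4, alpha=0.98):
    """Eq. (2.1): p_k = alpha * p_{k-1} @ A + (1 - alpha) * p_0.
    A is assumed to be row-normalized (each row sums to 1)."""
    n = A.shape[0]
    p0 = np.eye(n)                 # one restart distribution per vertex
    p = p0.copy()
    M = np.zeros((n, n))
    for _ in range(K):
        p = alpha * (p @ A) + (1 - alpha) * p0
        M += p                     # accumulate the probabilistic co-occurrence matrix
    return M
```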
PPMI matrix:
The PPMI matrix is computed from the probabilistic co-occurrence matrix obtained by random surfing:
\begin{aligned}
\text{PMI}_{w,c} &= \log \left( \frac{\#(w,c) \cdot \sum_{w,c} \#(w,c)}{\#(w) \cdot \#(c)} \right), \\
X_{w,c} &= \text{PPMI}_{w,c} = \max (\text{PMI}_{w,c}, 0).
\end{aligned}
\tag{2.2}
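A direct numpy transcription of Eq. (2.2), treating the random-surfing output as the counts $\#(\cdot)$; clamping undefined $\log 0$ entries to zero is my choice for empty co-occurrences.

```python
import numpy as np

def ppmi(M):
    """Eq. (2.2): positive pointwise mutual information of a co-occurrence matrix M."""
    total = M.sum()                            # sum_{w,c} #(w,c)
    row = M.sum(axis=1, keepdims=True)         # #(w)
    col = M.sum(axis=0, keepdims=True)         # #(c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(M * total / (row @ col))
    pmi[~np.isfinite(pmi)] = 0.0               # empty co-occurrences give log 0: clamp
    return np.maximum(pmi, 0.0)                # PPMI = max(PMI, 0)
```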
$X$ is then used as the input to the autoencoder.
The denoising loss, where $\tilde{x}^{(i)}$ is a corrupted version of $x^{(i)}$, is:
\min_{\theta_1, \theta_2} \sum_{i=1}^{N} \mathcal{L} \left( x^{(i)}, g_{\theta_2} ( f_{\theta_1} ( \tilde{x}^{(i)} ) ) \right).
\tag{2.3}
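Eq. (2.3) evaluates the reconstruction against the clean input while the encoder sees the corrupted copy $\tilde{x}^{(i)}$; a minimal sketch of one common corruption (masking noise, with the rate $p$ an assumption):

```python
import numpy as np

def corrupt(x, p=0.2, rng=None):
    """Masking noise: each entry of x is zeroed independently with probability p."""
    rng = np.random.default_rng(0) if rng is None else rng
    return x * (rng.random(x.shape) >= p)
```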
3 Variational Graph Auto-Encoders
VGAE [Kipf T, 2016, 3] applies the VAE to graph data. For an unweighted graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix, $D \in \mathbb{R}^{N \times N}$ is the diagonal degree matrix, $X \in \mathbb{R}^{N \times D}$ is the matrix of vertex feature vectors, and $Z \in \mathbb{R}^{N \times F}$ is the matrix of encoded feature vectors.
The inference model produces the new representation of $X$, i.e. the encoding:
q(Z | X, A) = \prod_{i=1}^{N} q(\vec{z}_i | X, A).
\tag{3.1}
where:
\begin{aligned}
q(\vec{z}_i | X, A) &= \mathcal{N} \left( \vec{z}_i | \vec{\mu}_i, \text{diag}(\vec{\sigma}_i^2) \right), \\
\vec{\mu} &= \text{GCN}_{\mu}(X, A), \\
\log \vec{\sigma} &= \text{GCN}_{\sigma}(X, A), \\
\text{GCN}(X, A) &= \tilde{A} \, \text{ReLU}(\tilde{A} X W_0) \, W_1, \\
\tilde{A} &= D^{-\frac{1}{2}} A D^{-\frac{1}{2}}.
\end{aligned}
\tag{3.2}
$\text{GCN}_{\mu}(X, A)$ and $\text{GCN}_{\sigma}(X, A)$ share the first-layer weights $W_0$.
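A numpy sketch of the two-layer GCN encoder in Eq. (3.2); the weight shapes and the guard for zero-degree vertices are illustrative assumptions.

```python
import numpy as np

def normalize_adj(A):
    """A_tilde = D^{-1/2} A D^{-1/2}, as written in Eq. (3.2)."""
    d = np.maximum(A.sum(axis=1), 1e-12)       # guard isolated vertices
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt

def vgae_encoder(X, A, W0, W_mu, W_sigma):
    """Two GCN heads: shared first layer W0, separate mu / log-sigma outputs."""
    A_t = normalize_adj(A)
    H = np.maximum(A_t @ X @ W0, 0.0)          # ReLU(A_tilde X W0)
    mu = A_t @ H @ W_mu                        # GCN_mu(X, A)
    log_sigma = A_t @ H @ W_sigma              # GCN_sigma(X, A)
    return mu, log_sigma
```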
The generative model reconstructs the graph by generating the adjacency matrix from the code $Z$:
P(A|Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij}|\vec{z}_i, \vec{z}_j).
\tag{3.3}
where:
p(A_{ij} = 1|\vec{z}_i, \vec{z}_j) = \sigma (\vec{z}_i^T \vec{z}_j).
\tag{3.4}
The loss function is the variational lower bound:
\mathcal{L} = \mathbb{E}_{q(Z | X, A)} \left[ \log p (A|Z) \right] - \text{KL} \left[ q(Z | X, A) \| p(Z) \right].
\tag{3.5}
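A sketch of drawing $Z$ with the reparameterization trick and scoring all edges with the inner-product decoder of Eqs. (3.3)-(3.4); a minimal illustration rather than the authors' implementation.

```python
import numpy as np

def sample_and_decode(mu, log_sigma, seed=0):
    """Reparameterize z = mu + sigma * eps, then A_hat[i, j] = sigmoid(z_i . z_j)."""
    rng = np.random.default_rng(seed)
    Z = mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)
    logits = Z @ Z.T                           # inner-product decoder, Eq. (3.4)
    return 1.0 / (1.0 + np.exp(-logits))       # edge probabilities A_hat
```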
The paper also proposes a non-probabilistic GAE:
\begin{aligned}
\hat{A} &= \sigma(Z Z^T), \\
Z &= \text{GCN}(X, A).
\end{aligned}
\tag{3.6}
4 MGAE: Marginalized Graph Autoencoder for Graph Clustering
MGAE [Wang C, 2017, 4] explicitly gives the optimal parameter $W$ for a single convolutional layer. The objective is:
\| X - \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} X W \|^2 + \lambda \| W \|_F^2.
\tag{4.1}
The input is randomly corrupted $m$ times (randomly keeping a subset of entries), giving $[ \widetilde{X}_1, \cdots, \widetilde{X}_m ]$. Writing $\bar{X} = \frac{1}{m} \sum_{i=1}^{m} \widetilde{X}_i$, the solution is:
W = \bar{X}^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right)^T X \left( \bar{X}^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right)^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right) X \right)^{-1}.
\tag{4.2}
The encoder is:
Z = \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right) X W.
\tag{4.3}
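A numpy sketch of the closed-form solution (4.2) followed by the encoder (4.3); the corruption scheme (feature masking with keep probability $p$) is an assumption, and Eq. (4.2) is transcribed as written, without the $\lambda$ term of (4.1).

```python
import numpy as np

def mgae_layer(X, A_norm, m=10, p=0.8, seed=0):
    """One marginalized GAE layer.
    A_norm: precomputed D~^{-1/2} A~ D~^{-1/2}; X: (n, d) content matrix.
    Averages m randomly corrupted copies of X, then solves Eq. (4.2) for W."""
    rng = np.random.default_rng(seed)
    X_bar = np.mean([X * (rng.random(X.shape) < p) for _ in range(m)], axis=0)
    AX = A_norm @ X
    P = X_bar.T @ A_norm.T @ X                 # left factor of Eq. (4.2)
    Q = X_bar.T @ A_norm.T @ AX                # bracketed factor to invert
    W = P @ np.linalg.inv(Q)
    return AX @ W                              # Eq. (4.3): Z = A_norm X W
```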
5 Learning Deep Network Representations with Adversarially Regularized Autoencoders
NetRA [Yu W, 2018, 5] adds a generative adversarial network on top of the autoencoder.
The network first takes random-walk sequences as its input, and then trains the encoder, decoder, generator, and discriminator jointly.
6 Deep Recursive Network Embedding with Regular Equivalence
DRNE [Tu K, 2018, 6] presents two definitions of equivalence.
(Structural Equivalence) We denote $s(u) = s(v)$ if nodes $u$ and $v$ are structurally equivalent. Then $s(u) = s(v)$ if and only if $\mathcal{N}(u) = \mathcal{N}(v)$.
(Regular Equivalence) We denote $r(u) = r(v)$ if nodes $u$ and $v$ are regularly equivalent. Then $r(u) = r(v)$ if and only if $\{ r(i) \mid i \in \mathcal{N}(u) \} = \{ r(j) \mid j \in \mathcal{N}(v) \}$.
The embedding procedure (a sketch of the first step follows the list):
for each vertex, find its neighbors and sort their embedding vectors by the neighbors' degrees;
feed the sorted sequence into an LSTM;
the LSTM output is then the reconstruction of the vertex's embedding.
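A small numpy sketch of the degree-ordered neighbor sequences; the LSTM itself is left abstract, since any sequence model consuming the returned arrays would do, and the embedding table $X$ is an assumption about the model state.

```python
import numpy as np

def neighbor_sequences(A, X):
    """For each vertex, return its neighbors' embeddings sorted by neighbor degree.
    A: (n, n) adjacency matrix; X: (n, d) current embedding table."""
    deg = A.sum(axis=1)
    seqs = []
    for v in range(A.shape[0]):
        nbrs = np.flatnonzero(A[v])
        nbrs = nbrs[np.argsort(deg[nbrs])]     # order the sequence by degree
        seqs.append(X[nbrs])                   # (|N(v)|, d) input sequence for the LSTM
    return seqs
```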
7 Adversarially Regularized Graph Autoencoder for Graph Embedding
ARGA [Pan S, 2018, 7] builds on [Kipf T, 2016, 3], replacing the convolution operation and adding an adversarial mechanism; unlike [Yu W, 2018, 5], it has only a discriminator and no separate generator.
Reconstruction loss of the non-probabilistic encoder:
\mathcal{L}_0 = \mathbb{E}_{q(Z|(X,A))} [\log p(\hat{A}|Z)].
\tag{7.1}
Reconstruction loss of the variational encoder:
\mathcal{L}_1 = \mathbb{E}_{q(Z|X,A)} \left[ \log p(\hat{A}|Z) - \text{KL} \left[ q(Z | X, A) \| p(Z) \right] \right].
\tag{7.2}
Adversarial loss:
\min_{\mathcal{G}} \max_{\mathcal{D}} \mathbb{E}_{\vec{z} \sim p_z} \left[ \log \mathcal{D}(Z) \right] + \mathbb{E}_{\vec{x} \sim p(\vec{x})} \left[ \log \left( 1 - \mathcal{D}(\mathcal{G}(X,A)) \right) \right].
\tag{7.3}
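A numpy sketch of one evaluation of the min-max objective (7.3), with a stand-in two-layer MLP as the discriminator $\mathcal{D}$; all weights, shapes, and the `eps` guard are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator(Z, W1, W2):
    """Stand-in two-layer MLP discriminator: D(z) in (0, 1)."""
    return sigmoid(np.maximum(Z @ W1, 0.0) @ W2)

def adversarial_losses(Z_fake, Z_real, W1, W2):
    """Eq. (7.3): the discriminator scores prior samples Z_real as real and
    encoder outputs Z_fake = G(X, A) as fake; the encoder wants the opposite."""
    eps = 1e-12                                # numerical guard for log
    d_real = discriminator(Z_real, W1, W2)
    d_fake = discriminator(Z_fake, W1, W2)
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```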
References
1 Wang D, Cui P, Zhu W, et al. Structural Deep Network Embedding[C]. Knowledge Discovery and Data Mining, 2016: 1225-1234.
2 Cao S, Lu W, Xu Q, et al. Deep Neural Networks for Learning Graph Representations[C]. National Conference on Artificial Intelligence, 2016: 1145-1152.
3 Kipf T, Welling M. Variational Graph Auto-Encoders[J]. arXiv: Machine Learning, 2016.
4 Wang C, Pan S, Long G, et al. MGAE: Marginalized Graph Autoencoder for Graph Clustering[C]. Conference on Information and Knowledge Management, 2017: 889-898.
5 Yu W, Zheng C, Cheng W, et al. Learning Deep Network Representations with Adversarially Regularized Autoencoders[C]. Knowledge Discovery and Data Mining, 2018: 2663-2671.
6 Tu K, Cui P, Wang X, et al. Deep Recursive Network Embedding with Regular Equivalence[C]. Knowledge Discovery and Data Mining, 2018: 2357-2366.
7 Pan S, Hu R, Long G, et al. Adversarially Regularized Graph Autoencoder for Graph Embedding[C]. International Joint Conference on Artificial Intelligence, 2018: 2609-2615.