Graph Convolutional Networks (GCN) -- Representative Auto-encoder Works

Graph Auto-encoder

1 Structural Deep Network Embedding

SDNE, proposed by [Wang D, 2016, 1], embeds graph data using the auto-encoder idea. The paper defines first-order and second-order proximity, and requires that both proximity relations between vertices be preserved by the encoding.

  • First-Order Proximity The first-order proximity describes the pairwise proximity between vertexes. For any pair of vertexes, if $a_{ij} > 0$, there exists positive first-order proximity between $v_i$ and $v_j$. Otherwise, the first-order proximity between $v_i$ and $v_j$ is 0.

  • Second-Order Proximity The second-order proximity between a pair of vertexes describes the proximity of the pair's neighborhood structure. Let $\mathcal{N}_u = \{ a_{u,1}, \cdots, a_{u,|\mathcal{V}|} \}$ denote the first-order proximity between $v_u$ and other vertexes. Then, second-order proximity is determined by the similarity of $\mathcal{N}_u$ and $\mathcal{N}_v$.

Encoder (layer-wise mapping):
$$
\begin{aligned}
\vec{y}_i^{(0)} &= \vec{x}_i, \\
\vec{y}_i^{(k)} &= \sigma \left( W^{(k)} \vec{y}_i^{(k-1)} + \vec{b}^{(k)} \right), \quad k = 1, \cdots, K.
\end{aligned} \tag{1.1}
$$

Reconstruction loss:
$$ \mathcal{L} = \sum_{i=1}^{n} \| \hat{\vec{x}}_i - \vec{x}_i \|_2^2. \tag{1.2} $$

First-order proximity loss:
$$ \mathcal{L}_{\text{1st}} = \sum_{i,j=1}^{n} a_{i,j} \| \vec{y}_i^{(K)} - \vec{y}_j^{(K)} \|_2^2. \tag{1.3} $$

Second-order proximity loss:
$$
\begin{aligned}
\mathcal{L}_{\text{2nd}} &= \sum_{i=1}^{n} \| ( \hat{\vec{x}}_i - \vec{x}_i ) \odot \vec{b}_i \|_2^2 \\
&= \| ( \hat{X} - X ) \odot B \|_F^2.
\end{aligned} \tag{1.4}
$$

ai,j=0a_{i,j}=0时,bi=1\vec{b}_i = \vec{1};如果ai,j>0a_{i,j}>0,则bi\vec{b}_i是一个大于1的超参数。通过自编码器,如果两个节点具有相近的邻接点结构,则在表示空间中距离越近。

Weight penalty term:
$$ \mathcal{L}_{\text{reg}} = \frac{1}{2} \sum_{k=1}^{K} \left( \| W^{(k)} \|_F^2 + \| \hat{W}^{(k)} \|_F^2 \right). \tag{1.5} $$

The total loss is:
$$ \mathcal{L}_{\text{mix}} = \mathcal{L}_{\text{2nd}} + \alpha \mathcal{L}_{\text{1st}} + \gamma \mathcal{L}_{\text{reg}}. \tag{1.6} $$
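To make the objective concrete, here is a minimal NumPy sketch of Eqs. (1.2)-(1.4) and (1.6) (without the weight-decay term), assuming, as in SDNE, that the input $\vec{x}_i$ is the $i$-th row of the adjacency matrix; `sdne_loss`, `beta`, and `alpha` are illustrative names, not the paper's code:

```python
import numpy as np

def sdne_loss(A, X_hat, Y, beta=5.0, alpha=1e-2):
    """Eqs. (1.2)-(1.4) and (1.6) without the weight-decay term.
    A: (n, n) adjacency, whose rows are the inputs x_i.
    X_hat: (n, n) decoder reconstructions; Y: (n, d) embeddings."""
    B = np.where(A > 0, beta, 1.0)              # b_{i,j} of Eq. (1.4)
    L_2nd = np.sum(((X_hat - A) * B) ** 2)      # weighted reconstruction
    L_graph = np.diag(A.sum(axis=1)) - A        # graph Laplacian D - A
    L_1st = 2.0 * np.trace(Y.T @ L_graph @ Y)   # = sum_ij a_ij ||y_i - y_j||^2
    return L_2nd + alpha * L_1st
```

The first-order term uses the identity $\sum_{i,j} a_{i,j} \| \vec{y}_i - \vec{y}_j \|_2^2 = 2\,\operatorname{tr}(Y^{\top} (D - A) Y)$, which avoids the explicit double sum.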


2 Deep Neural Networks for Learning Graph Representations

DNGR [Cao S, 2016, 2] combines random walks with a deep auto-encoder. The model has three components: random walks (random surfing), positive pointwise mutual information (PPMI) computation, and a stacked denoising auto-encoder.

  1. Random walk (random surfing):

Given the maximum number of walk steps and the transition matrix $A$ (the row-normalized adjacency matrix):
$$ p_k = \alpha\, p_{k-1} A + (1 - \alpha)\, p_0, \tag{2.1} $$
where $\alpha$ is the smoothing (restart) parameter.
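A small NumPy sketch of this recursion, under the assumption that $A$ in Eq. (2.1) is the row-normalized transition matrix and that the co-occurrence statistics passed to the next stage accumulate the $p_k$; all names are illustrative:

```python
import numpy as np

def random_surf(A, K, alpha=0.98):
    """Iterate Eq. (2.1) for K steps from every start vertex at once
    and accumulate visit probabilities into a co-occurrence matrix.
    Assumes every vertex has at least one edge."""
    P = A / A.sum(axis=1, keepdims=True)   # row-normalized transitions
    p0 = np.eye(A.shape[0])                # row i: surfer restarted at vertex i
    pk, M = p0.copy(), np.zeros_like(p0)
    for _ in range(K):
        pk = alpha * (pk @ P) + (1 - alpha) * p0
        M += pk
    return M                               # input to the PPMI stage
```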

  2. PPMI matrix:
    Compute the PPMI matrix from the probabilistic co-occurrence matrix produced by the random walks (sketched after this list):
    $$
    \begin{aligned}
    \text{PMI}_{w,c} &= \log \left( \frac{\#(w,c) \cdot \sum_{w,c} \#(w,c)}{\#(w) \cdot \#(c)} \right), \\
    X_{w,c} &= \text{PPMI}_{w,c} = \max (\text{PMI}_{w,c}, 0).
    \end{aligned} \tag{2.2}
    $$

  3. Feed $X$ into the stacked denoising auto-encoder.
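As referenced in step 2, a NumPy sketch of Eq. (2.2), reading $\#(w,c)$ as an entry of the co-occurrence matrix and $\#(w)$, $\#(c)$ as its row and column sums; the small epsilon is an assumption added for numerical safety:

```python
import numpy as np

def ppmi(M, eps=1e-12):
    """PPMI of Eq. (2.2): pointwise mutual information, clipped at 0."""
    total = M.sum()                        # sum_{w,c} #(w,c)
    row = M.sum(axis=1, keepdims=True)     # #(w), row marginals
    col = M.sum(axis=0, keepdims=True)     # #(c), column marginals
    pmi = np.log((M * total + eps) / (row * col + eps))
    return np.maximum(pmi, 0.0)
```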

The loss is:
$$ \min_{\theta_1, \theta_2} \sum_{i=1}^{N} \mathcal{L} \left( x^{(i)}, g_{\theta_2} ( f_{\theta_1} ( \tilde{x}^{(i)} ) ) \right). \tag{2.3} $$
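A minimal PyTorch sketch of one denoising layer implementing this objective, where the corruption $\tilde{x}$ is masking noise; the layer sizes, sigmoid activations, and noise rate are assumptions rather than the paper's exact settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenoisingAE(nn.Module):
    def __init__(self, d_in, d_hid):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(d_in, d_hid), nn.Sigmoid())  # f_{theta_1}
        self.g = nn.Sequential(nn.Linear(d_hid, d_in), nn.Sigmoid())  # g_{theta_2}

    def forward(self, x, drop=0.2):
        x_tilde = x * (torch.rand_like(x) > drop)   # masking-noise corruption
        return self.g(self.f(x_tilde))

# Eq. (2.3): the reconstruction is scored against the *clean* input x,
# e.g. loss = F.mse_loss(model(x), x)
```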

3 Variational Graph Auto-Encoders

[Kipf T, 2016, 3] applies the VAE to graph data. For an unweighted graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix, $D \in \mathbb{R}^{N \times N}$ is the diagonal degree matrix, $X \in \mathbb{R}^{N \times D}$ is the matrix of vertex feature vectors, and $Z \in \mathbb{R}^{N \times F}$ is the matrix of encoded vertex representations.

  • Inference model

The inference model produces the new representation of $X$, i.e., the encoding:
$$ q(Z \mid X, A) = \prod_{i=1}^{N} q(\vec{z}_i \mid X, A), \tag{3.1} $$
where:
$$
\begin{aligned}
q(\vec{z}_i \mid X, A) &= \mathcal{N} \left( \vec{z}_i \mid \vec{\mu}_i, \operatorname{diag}(\vec{\sigma}_i^2) \right), \\
\vec{\mu} &= \text{GCN}_{\mu}(X, A), \\
\log \vec{\sigma} &= \text{GCN}_{\sigma}(X, A), \\
\text{GCN}(X, A) &= \tilde{A}\, \text{ReLU}(\tilde{A} X W_0) W_1, \\
\tilde{A} &= D^{-\frac{1}{2}} A D^{-\frac{1}{2}}.
\end{aligned} \tag{3.2}
$$
$\text{GCN}_{\mu}(X, A)$ and $\text{GCN}_{\sigma}(X, A)$ share the first-layer weights $W_0$.
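A PyTorch sketch of this inference model, where `A_norm` is the precomputed $\tilde{A}$ and sampling uses the reparameterization trick; layer sizes and names are illustrative:

```python
import torch
import torch.nn as nn

class VGAEEncoder(nn.Module):
    def __init__(self, d_in, d_hid, d_lat):
        super().__init__()
        self.W0 = nn.Linear(d_in, d_hid, bias=False)        # shared first layer
        self.W1_mu = nn.Linear(d_hid, d_lat, bias=False)    # head for mu
        self.W1_sig = nn.Linear(d_hid, d_lat, bias=False)   # head for log sigma

    def forward(self, X, A_norm):
        H = torch.relu(A_norm @ self.W0(X))        # ReLU(Ã X W0)
        mu = A_norm @ self.W1_mu(H)                # GCN_mu(X, A)
        log_sigma = A_norm @ self.W1_sig(H)        # GCN_sigma(X, A)
        z = mu + torch.randn_like(mu) * torch.exp(log_sigma)  # reparameterize
        return z, mu, log_sigma
```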

  • Generative model

The generative model reconstructs the graph by generating the adjacency matrix from the encoding $Z$:

$$ p(A \mid Z) = \prod_{i=1}^{N} \prod_{j=1}^{N} p(A_{ij} \mid \vec{z}_i, \vec{z}_j), \tag{3.3} $$

where:
$$ p(A_{ij} = 1 \mid \vec{z}_i, \vec{z}_j) = \sigma (\vec{z}_i^{\top} \vec{z}_j). \tag{3.4} $$
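In code, this decoder is a single line, e.g.:

```python
import torch

def decode(Z):
    """Eq. (3.4): edge probabilities sigma(z_i^T z_j) for all pairs."""
    return torch.sigmoid(Z @ Z.T)
```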

  • Learning

The loss is the variational lower bound:
$$ \mathcal{L} = \mathbb{E}_{q(Z \mid X, A)} \left[ \log p (A \mid Z) \right] - \text{KL} \left[ q(Z \mid X, A) \,\|\, p(Z) \right]. \tag{3.5} $$
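A sketch of the negative of Eq. (3.5) as a minimization target, assuming `mu` and `log_sigma` come from the inference model above and using the closed-form KL divergence between a diagonal Gaussian and the standard-normal prior $p(Z)$; the relative scaling of the two terms is a convention choice:

```python
import torch
import torch.nn.functional as F

def vgae_loss(A, Z, mu, log_sigma):
    """A is the (float, 0/1) adjacency matrix to reconstruct."""
    A_hat = torch.sigmoid(Z @ Z.T)                     # decoder, Eq. (3.4)
    recon = F.binary_cross_entropy(A_hat, A)           # -E_q[log p(A|Z)]
    # KL[N(mu, diag(sigma^2)) || N(0, I)], summed over latent dimensions
    kl = -0.5 * torch.sum(1 + 2 * log_sigma - mu ** 2
                          - torch.exp(2 * log_sigma), dim=1)
    return recon + kl.mean()
```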

The paper also proposes a non-probabilistic GAE:
$$
\begin{aligned}
\hat{A} &= \sigma(Z Z^{\top}), \\
Z &= \text{GCN}(X, A).
\end{aligned} \tag{3.6}
$$

4 MGAE: Marginalized Graph Autoencoder for Graph Clustering

[Wang C, 2017, 4] explicitly gives the optimal parameter $W$ of a single convolution layer for the objective
$$ \| X - \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} X W \|^2 + \lambda \| W \|_F^2. \tag{4.1} $$

The input is corrupted $m$ times, each time randomly keeping a subset of the vertices' features, giving $[\widetilde{X}_1, \cdots, \widetilde{X}_m]$; write $\bar{X} = \frac{1}{m} \sum_{i=1}^{m} \widetilde{X}_i$. Then:
$$ W = \bar{X}^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right)^T X \left( \bar{X}^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right)^T \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right) X \right)^{-1}. \tag{4.2} $$

The encoder is:
$$ Z = \left( \widetilde{D}^{-\frac{1}{2}} \widetilde{A} \widetilde{D}^{-\frac{1}{2}} \right) X W. \tag{4.3} $$
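A NumPy sketch of Eqs. (4.2)-(4.3), assuming self-loops ($\widetilde{A} = A + I$), masking-style corruption that keeps each feature entry with probability `keep`, and a small $\lambda I$ ridge term from Eq. (4.1) added so the inverse exists; all names are illustrative:

```python
import numpy as np

def mgae(A, X, m=10, keep=0.8, lam=1e-3):
    n = A.shape[0]
    A_t = A + np.eye(n)                            # Ã with self-loops (assumed)
    d = A_t.sum(axis=1)
    S = A_t / np.sqrt(np.outer(d, d))              # D̃^{-1/2} Ã D̃^{-1/2}
    X_bar = np.mean([X * (np.random.rand(*X.shape) < keep)
                     for _ in range(m)], axis=0)   # average of corrupted inputs
    P = X_bar.T @ S.T @ X                          # numerator of Eq. (4.2)
    Q = X_bar.T @ S.T @ S @ X                      # Gram term of Eq. (4.2)
    W = P @ np.linalg.inv(Q + lam * np.eye(Q.shape[0]))
    return W, S @ X @ W                            # W and the encoding Z, Eq. (4.3)
```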


5 Learning Deep Network Representations with Adversarially Regularized Autoencoders

[Yu W, 2018, 5] adds a generative adversarial network on top of the auto-encoder.


The network first takes random-walk sequences as its input, and then trains the encoder, decoder, generator, and discriminator; the training procedure is illustrated in a figure in the original paper.


6 Deep Recursive Network Embedding with Regular Equivalence

[Tu K, 2018, 6] introduces two equivalence definitions.

  • (Structural Equivalence) We denote $s(u) = s(v)$ if nodes $u$ and $v$ are structurally equivalent. Then $s(u) = s(v)$ if and only if $\mathcal{N}(u) = \mathcal{N}(v)$.

  • (Regular Equivalence) We denote $r(u) = r(v)$ if nodes $u$ and $v$ are regularly equivalent. Then $r(u) = r(v)$ if and only if $\{ r(i) \mid i \in \mathcal{N}(u) \} = \{ r(j) \mid j \in \mathcal{N}(v) \}$.

The embedding procedure:

  1. For a vertex, collect its neighbors and sort their feature vectors by the neighbors' degrees;
  2. Feed the sorted sequence into an LSTM;
  3. The LSTM output is the reconstruction of the vertex (a sketch follows below).
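A PyTorch sketch of the reconstruction step referenced above, assuming `emb` is the table of current vertex embeddings and `degrees` holds every vertex's degree; names and sizes are illustrative, and only the aggregation/reconstruction step is sketched:

```python
import torch
import torch.nn as nn

class DRNEStep(nn.Module):
    def __init__(self, d_emb):
        super().__init__()
        self.lstm = nn.LSTM(d_emb, d_emb, batch_first=True)

    def forward(self, emb, neighbors, degrees):
        order = torch.argsort(degrees[neighbors])   # sort neighbors by degree
        seq = emb[neighbors[order]].unsqueeze(0)    # (1, |N(v)|, d) sequence
        out, _ = self.lstm(seq)
        return out[0, -1]                           # reconstruction of emb[v]

# Training minimizes ||emb[v] - step(emb, N(v), degrees)||^2 over vertices v.
```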

图卷积神经网络GCN--自动编码器代表作

7 Adversarially Regularized Graph Autoencoder for Graph Embedding

[Pan S, 2018, 7] builds on [Kipf T, 2016, 3], replacing the convolution operation and adding an adversarial mechanism; unlike [Yu W, 2018, 5], there is only a discriminator and no separate generator (the graph encoder $\mathcal{G}$ plays the generator's role).

Reconstruction loss with the non-probabilistic encoder:
$$ \mathcal{L}_0 = \mathbb{E}_{q(Z \mid X, A)} [\log p(\hat{A} \mid Z)]. \tag{7.1} $$

Reconstruction loss with the variational encoder:
$$ \mathcal{L}_1 = \mathbb{E}_{q(Z \mid X, A)} \left[ \log p(\hat{A} \mid Z) \right] - \text{KL} \left[ q(Z \mid X, A) \,\|\, p(Z) \right]. \tag{7.2} $$

Adversarial loss:
$$ \min_{\mathcal{G}} \max_{\mathcal{D}} \; \mathbb{E}_{\vec{z} \sim p_z} \left[ \log \mathcal{D}(\vec{z}) \right] + \mathbb{E}_{\vec{x} \sim p(\vec{x})} \left[ \log \left( 1 - \mathcal{D}(\mathcal{G}(X, A)) \right) \right]. \tag{7.3} $$
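A PyTorch sketch of the discriminator side of Eq. (7.3): real samples come from the Gaussian prior $p_z$ and fake samples are the encoder's embeddings; `D` is assumed to be a small MLP that returns one logit per row, and all names are assumptions:

```python
import torch
import torch.nn as nn

def discriminator_step(D, Z_fake):
    bce = nn.BCEWithLogitsLoss()
    Z_real = torch.randn_like(Z_fake)              # z ~ p_z, the Gaussian prior
    ones = torch.ones(Z_real.shape[0], 1)
    zeros = torch.zeros(Z_fake.shape[0], 1)
    # D is pushed toward 1 on prior samples and 0 on embeddings
    return bce(D(Z_real), ones) + bce(D(Z_fake.detach()), zeros)

# The encoder G is then updated to fool D, i.e. to minimize
# bce(D(Z_fake), ones), which pushes q(Z|X, A) toward the prior p_z.
```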


References

[1] Wang D, Cui P, Zhu W. Structural Deep Network Embedding. KDD 2016.
[2] Cao S, Lu W, Xu Q. Deep Neural Networks for Learning Graph Representations. AAAI 2016.
[3] Kipf T N, Welling M. Variational Graph Auto-Encoders. NIPS Workshop on Bayesian Deep Learning 2016.
[4] Wang C, Pan S, Long G, Zhu X, Jiang J. MGAE: Marginalized Graph Autoencoder for Graph Clustering. CIKM 2017.
[5] Yu W, Zheng C, Cheng W, Aggarwal C C, Song D, Zong B, Chen H, Wang W. Learning Deep Network Representations with Adversarially Regularized Autoencoders. KDD 2018.
[6] Tu K, Cui P, Wang X, Yu P S, Zhu W. Deep Recursive Network Embedding with Regular Equivalence. KDD 2018.
[7] Pan S, Hu R, Long G, Jiang J, Yao L, Zhang C. Adversarially Regularized Graph Autoencoder for Graph Embedding. IJCAI 2018.