Casual paper notes: Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec
Paper: https://arxiv.org/abs/1710.02971
Summary:
This paper, a collaboration between Microsoft and Tsinghua published in 2018, unifies the DeepWalk, LINE, PTE, and node2vec algorithms under a matrix factorization (MF) framework, and on that basis proposes the NetMF algorithm.
Background:
All kinds of network embedding models are popular, but there is little theoretical analysis of what these methods are actually doing. Some earlier work in this direction:
Omer Levy and Yoav Goldberg. 2014. Neural Word Embedding as Implicit Matrix Factorization. In NIPS. 2177–2185
(shows that skip-gram with negative sampling implicitly factorizes a word–context matrix)
Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, and Andrej Risteski. 2016. A latent variable model approach to pmi-based word embeddings. TACL 4 (2016), 385–399
Tatsunori B Hashimoto, David Alvarez-Melis, and Tommi S Jaakkola. 2016. Word embeddings as metric recovery in semantic spaces. TACL 4 (2016), 273–286
(explain word embedding models from geometric perspectives)
However, the connection between the word–context matrix and network structure had barely been analyzed, and the relationships among this family of embedding algorithms remained underexplored.
Contributions:
1. Show that DeepWalk, LINE, PTE, and node2vec are in theory performing implicit matrix factorization, and give closed forms for the factorized matrices
2. LINE can be seen as a special case of DeepWalk, when the context window size T in skip-gram is set to 1
3. PTE, as an extension of LINE, is actually an implicit factorization of the joint matrix of multiple networks
4. Discover a theoretical connection between DeepWalk's implicit matrix and graph Laplacians, and on that basis propose NetMF
Method overview:
1. LINE
For an undirected graph G = (V, E, A) (A is the adjacency matrix), LINE learns two embedding matrices X and Y, whose rows x_i and y_j are the vertex embedding of vertex i and the context embedding of vertex j.
The objective of LINE (2nd order) is to maximize

$$\ell = \sum_{i=1}^{|V|}\sum_{j=1}^{|V|} A_{i,j}\Big(\log g(x_i^\top y_j) + b\,\mathbb{E}_{j'\sim P_N}\big[\log g(-x_i^\top y_{j'})\big]\Big)$$

where
g is the sigmoid function, g(z) = 1/(1 + e^{-z})
b is the number of negative samples
P_N is the noise distribution used for negative sampling (in the paper's analysis, P_N(j) = d_j / vol(G))
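As a concrete check, the objective above can be evaluated exactly on a small graph with numpy. `line_objective` below is my own sketch (the function and variable names are not from the paper); it computes the expectation over P_N in closed form instead of by sampling:

```python
import numpy as np

def line_objective(A, X, Y, b=1):
    """LINE (2nd order) negative-sampling objective, with the expectation
    over the noise distribution P_N(j) = d_j / vol(G) computed exactly.
    Assumes A is a symmetric nonnegative adjacency matrix with no
    isolated vertices."""
    g = lambda z: 1.0 / (1.0 + np.exp(-z))
    d = A.sum(axis=1)                  # vertex degrees d_i
    vol = d.sum()                      # vol(G) = sum_ij A_ij
    pn = d / vol                       # noise distribution P_N
    S = X @ Y.T                        # all pairwise scores x_i^T y_j
    pos = np.sum(A * np.log(g(S)))     # positive term, weighted by A_ij
    # negative term: b * sum_i d_i * E_{j'~P_N}[ log g(-x_i^T y_j') ]
    neg = b * np.sum(d[:, None] * pn[None, :] * np.log(g(-S)))
    return pos + neg
```

Since every log g(·) term is negative, the objective is always below zero; maximizing it pushes x_i^T y_j up on observed edges and down on noise pairs.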
2. PTE
PTE extends LINE to heterogeneous text networks. For a bipartite network G = (V1 ∪ V2, E, A), the task is to find an embedding x_i for every v_i ∈ V1 and y_j for every v_j ∈ V2; the optimization objective is the same negative-sampling objective as LINE applied to A, summed over the word–word, document–word, and label–word bipartite subnetworks.
3. DeepWalk
DeepWalk first generates a "corpus" of vertex sequences with random walks, then runs skip-gram over that corpus to obtain the embeddings. The paper lays out DeepWalk's basic steps: repeated fixed-length uniform random walks from every vertex, followed by skip-gram with context window size T.
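The corpus-generation step can be sketched as follows. This is a minimal illustration, not DeepWalk's actual implementation; `adj` is assumed to be a neighbor-list dict and every vertex is assumed to have at least one neighbor:

```python
import random

def deepwalk_corpus(adj, num_walks=10, walk_length=40, seed=0):
    """Generate DeepWalk's random-walk "corpus": starting from every
    vertex, step to a uniformly random neighbor, walk_length vertices
    per walk, num_walks passes over the graph.
    adj: dict mapping vertex -> list of its neighbors."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(num_walks):
        for v in adj:
            walk = [v]
            for _ in range(walk_length - 1):
                walk.append(rng.choice(adj[walk[-1]]))
            corpus.append(walk)
    return corpus
```

Skip-gram is then trained on `corpus` exactly as if each walk were a sentence; it is this combination that the paper shows to be an implicit matrix factorization.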
4. node2vec
Similar to DeepWalk, but the random walk is second order: the next step depends on both the current vertex and the previous one. For G = (V, E, A), the (unnormalized) probability of stepping from v to w after arriving from u is A_{v,w} · α_{pq}(u, w), where α_{pq}(u, w) = 1/p if w = u, 1 if (u, w) ∈ E, and 1/q otherwise
(u, v, w are all vertices of G, E is the edge set; p and q are node2vec's return and in–out parameters)
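One second-order step can be sketched like this (my own illustrative helper, not the paper's code; `adj` maps each vertex to its neighbor list):

```python
import random

def node2vec_step(adj, u, v, p=1.0, q=1.0, rng=None):
    """Choose the next vertex w given the previous vertex u and the
    current vertex v, using node2vec's biased weights:
    1/p to return to u, 1 for neighbors of u, 1/q otherwise.
    adj: dict mapping vertex -> list of its neighbors."""
    rng = rng or random.Random()
    nbrs = adj[v]
    weights = []
    for w in nbrs:
        if w == u:
            weights.append(1.0 / p)   # return to the previous vertex
        elif w in adj[u]:
            weights.append(1.0)       # w is at distance 1 from u
        else:
            weights.append(1.0 / q)   # w is at distance 2 from u
    return rng.choices(nbrs, weights=weights, k=1)[0]
```

With p = q = 1 this degenerates to DeepWalk's uniform walk; large p and q bias the walk toward common neighbors of u and v.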
Conclusions:
Through a chain of mathematical derivations the authors prove the closed forms below; the proofs are too long to reproduce here.
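From my reading of the paper, the two central closed forms, the matrices implicitly factorized (log taken element-wise, b the number of negative samples, vol(G) = Σ_{i,j} A_{i,j}), are:

```latex
\text{DeepWalk:}\quad
\log\!\left(\frac{\operatorname{vol}(G)}{bT}
\left(\sum_{r=1}^{T}\left(D^{-1}A\right)^{r}\right)D^{-1}\right)
\qquad
\text{LINE:}\quad
\log\!\left(\frac{\operatorname{vol}(G)}{b}\,D^{-1}AD^{-1}\right)
```

Setting T = 1 in DeepWalk's matrix recovers exactly LINE's, which is contribution 2 above; PTE factorizes the corresponding joint matrix over its bipartite subnetworks, and node2vec an analogous matrix built from the stationary distribution of its second-order walk.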
NetMF:
NetMF first preprocesses the input matrix M (DeepWalk's implicit matrix above, with T the window size) into M' = log(max(M, 1)), and only then performs MF, factorizing M' to obtain the embedding.
The paper also designs different optimizations for small and large windows: for small T, M is computed exactly by summing the T matrix powers; for large T, the power sum is approximated with a truncated eigendecomposition of the normalized adjacency matrix D^{-1/2} A D^{-1/2}.
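For the small-window case, the whole pipeline fits in a few lines of numpy. This is an illustrative sketch (dense matrices only; the function and argument names are mine), not the authors' released implementation:

```python
import numpy as np

def netmf_small(A, T=10, b=1, dim=128):
    """NetMF for small windows: build DeepWalk's implicit matrix
    exactly, apply the truncated element-wise log, and factorize by SVD.
    Assumes A is a symmetric adjacency matrix with no isolated vertices."""
    n = A.shape[0]
    d = A.sum(axis=1)                       # degrees
    vol = d.sum()                           # vol(G)
    P = A / d[:, None]                      # transition matrix D^{-1} A
    S = np.zeros((n, n))
    Pr = np.eye(n)
    for _ in range(T):                      # sum_{r=1}^{T} (D^{-1}A)^r
        Pr = Pr @ P
        S += Pr
    M = (vol / (b * T)) * S / d[None, :]    # right-multiply by D^{-1}
    Mp = np.log(np.maximum(M, 1.0))         # M' = log(max(M, 1))
    U, s, _ = np.linalg.svd(Mp)             # factorize M' ~ U diag(s) V^T
    k = min(dim, n)
    return U[:, :k] * np.sqrt(s[:k])        # embedding U_k sqrt(Sigma_k)
```

For large T this exact power sum is too expensive, which is where the paper's eigendecomposition-based approximation comes in.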