【KDD 2019】Is a Single Vector Enough? Exploring Node Polysemy for Network Embedding

Abstract

Networks have been widely used as the data structure for abstracting real-world systems as well as organizing the relations among entities. Network embedding models are powerful tools in mapping nodes in a network into continuous vector-space representations in order to facilitate subsequent tasks such as classification and link prediction.

Existing network embedding models comprehensively integrate all information of each node, such as links and attributes, towards a single embedding vector to represent the node’s general role in the network. However, a real-world entity could be multifaceted, where it connects to different neighborhoods due to different motives or self-characteristics that are not necessarily correlated.

For example, in a movie recommender system, a user may love comedies or horror movies simultaneously, but it is not likely that these two types of movies are mutually close in the embedding space, nor the user embedding vector could be sufficiently close to them at the same time.

In this paper, we propose a polysemous embedding approach for modeling multiple facets of nodes, as motivated by the phenomenon of word polysemy in language modeling. Each facet of a node is mapped as an embedding vector, while we also maintain association degree between each pair of node and facet. The proposed method is adaptive to various existing embedding models, without significantly complicating the optimization process. We also discuss how to engage embedding vectors of different facets for inference tasks including classification and link prediction. Experiments on real-world datasets help comprehensively evaluate the performance of the proposed method.
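
To make the idea concrete, here is a minimal toy sketch of the movie example, not the paper's implementation. It assumes a 2-D embedding space in which comedies and horror movies point in orthogonal directions, and it assumes that a node pair is scored by an association-weighted sum of dot products over all facet pairs. That scoring rule, along with the names `polysemous_score`, `user_facets`, and `user_assoc`, is a hypothetical choice made for illustration only.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def polysemous_score(facets_a, assoc_a, facets_b, assoc_b):
    """Association-weighted sum of dot products over all facet pairs.

    This combination rule is an assumption for illustration; the paper
    may combine facet embeddings differently.
    """
    return sum(
        pa * pb * dot(ua, ub)
        for ua, pa in zip(facets_a, assoc_a)
        for ub, pb in zip(facets_b, assoc_b)
    )


# Toy space (assumed): axis 0 ~ comedy, axis 1 ~ horror.
comedy = [1.0, 0.0]
horror = [0.0, 1.0]

# Single-vector user: the best compromise sits between the two genres,
# so it cannot be very close to either one.
single_user = [1.0, 1.0]
single_sims = (cosine(single_user, comedy), cosine(single_user, horror))

# Polysemous user: one facet per taste, with equal association degrees.
user_facets = [[1.0, 0.0], [0.0, 1.0]]
user_assoc = [0.5, 0.5]

# Each genre is matched perfectly by one of the user's facets.
facet_sims = (
    max(cosine(f, comedy) for f in user_facets),
    max(cosine(f, horror) for f in user_facets),
)

# Movies are treated here as single-facet nodes with association 1.0.
score_comedy = polysemous_score(user_facets, user_assoc, [comedy], [1.0])
score_horror = polysemous_score(user_facets, user_assoc, [horror], [1.0])
```

In this toy setting the single user vector reaches a cosine similarity of only about 0.71 with each genre, while each facet of the polysemous user aligns exactly with one genre. The association degrees then control how much each facet contributes to the final score.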

Framework
