Metapath2Vec: Scalable Representation Learning for Heterogeneous Networks


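metapath2vec is a scalable representation learning method for heterogeneous networks. It is designed to overcome the limitations that conventional network embedding techniques face on networks containing multiple types of nodes and links. The method was proposed by Yuxiao Dong et al., whose affiliations are Microsoft Research, the University of Notre Dame, and the Army Research Laboratory.

[Introduction to metapath2vec] In a heterogeneous network, nodes and edges can be of many different types, which poses a challenge for embedding techniques built for homogeneous networks. metapath2vec addresses this with a scalable representation learning model. Its core idea is to use random walks guided by meta-paths to construct the heterogeneous neighborhood of each node, and then to learn node embeddings from those neighborhoods with a heterogeneous skip-gram model.

[Meta-paths and random walks] A meta-path is a sequence of node types that a path must follow. For example, A-B-A describes a path from a node of type A to a node of type B and back to a node of type A, where A and B are different node types in the network. In metapath2vec, meta-paths are used to generate the context of each node so that the structural information of the network is captured. Because a meta-path-guided walk steps only to neighbors of the type the meta-path expects next, the resulting contexts reflect the relationships between different types of nodes.

[Heterogeneous skip-gram model] Skip-gram is one of the word2vec model architectures; it learns a representation for a word by predicting the words in its context. metapath2vec extends it to the nodes of a heterogeneous network. The heterogeneous skip-gram model is trained to predict the neighborhood of a given node while taking node types into account, so the embedding vectors reflect both the local structure around a node and the semantic information of the network.

[The metapath2vec++ improvement] metapath2vec++ builds on metapath2vec by modeling structural and semantic correlations in heterogeneous networks simultaneously. The model stays sensitive to structural information while also capturing semantic associations, which is intended to improve the quality of the embeddings.

[Experimental results] Extensive experiments show that metapath2vec and metapath2vec++ outperform state-of-the-art embedding models on heterogeneous network mining tasks such as node classification, clustering, and similarity search.

In short, the metapath2vec models combine meta-paths with the skip-gram model to capture both the structure and the semantics of heterogeneous networks, and they provide node embedding vectors for downstream network analysis and applications.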
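The two steps summarized above, meta-path-guided walks followed by skip-gram context extraction, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy graph, the meta-path "APVPA" (author, paper, venue, paper, author), and the helper names `metapath_walk` and `skipgram_pairs` are all hypothetical, and the sketch stops at producing (center, context) training pairs rather than training embeddings.

```python
import random
from collections import defaultdict

# Toy heterogeneous academic network (all names are made up for illustration).
# Node type is encoded by the first letter: a = author, p = paper, v = venue.
EDGES = [
    ("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p3"),  # author writes paper
    ("p1", "v1"), ("p2", "v1"), ("p3", "v2"),                # paper appears at venue
]
NODE_TYPE = {n: n[0].upper() for edge in EDGES for n in edge}

ADJ = defaultdict(list)
for u, v in EDGES:          # undirected adjacency lists
    ADJ[u].append(v)
    ADJ[v].append(u)


def metapath_walk(start, metapath, walk_length, rng):
    """Random walk that only steps to neighbors of the type the meta-path expects next."""
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same type to be repeated")
    if NODE_TYPE[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")
    cycle = metapath[:-1]   # "APVPA" -> "APVP", tiled as A P V P A P V P ...
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in ADJ[walk[-1]] if NODE_TYPE[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


def skipgram_pairs(walk, window):
    """(center, context) pairs for every node and its neighbors within `window` steps."""
    pairs = []
    for i, center in enumerate(walk):
        for j in range(max(0, i - window), min(len(walk), i + window + 1)):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


rng = random.Random(0)      # fixed seed so the sketch is reproducible
walks = [metapath_walk(a, "APVPA", 9, rng) for a in ("a1", "a2", "a3")]
for w in walks:
    print(" ".join(w))
print(len(skipgram_pairs(walks[0], window=2)), "training pairs from the first walk")
```

In a fuller pipeline the pairs would feed a skip-gram trainer; a uniform random walk would differ only in dropping the type filter on `candidates`.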


metapath2vec: Scalable Representation Learning for Heterogeneous Networks

Yuxiao Dong∗
Microsoft Research
Redmond, WA 98052
yuxdong@microsoft.com

Nitesh V. Chawla
University of Notre Dame
Notre Dame, IN 46556
nchawla@nd.edu

Ananthram Swami
Army Research Laboratory
Adelphi, MD 20783
ananthram.swami.civ@mail.mil

ABSTRACT

We study the problem of representation learning in heterogeneous networks. Its unique challenges come from the existence of multiple types of nodes and links, which limit the feasibility of conventional network embedding techniques. We develop two scalable representation learning models, namely metapath2vec and metapath2vec++. The metapath2vec model formalizes meta-path-based random walks to construct the heterogeneous neighborhood of a node and then leverages a heterogeneous skip-gram model to perform node embeddings. The metapath2vec++ model further enables the simultaneous modeling of structural and semantic correlations in heterogeneous networks. Extensive experiments show that metapath2vec and metapath2vec++ are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, such as node classification, clustering, and similarity search, but also discern the structural and semantic correlations between diverse network objects.

CCS CONCEPTS

• Information systems → Social networks; • Computing methodologies → Unsupervised learning; Learning latent representations; Knowledge representation and reasoning;

KEYWORDS

Network Embedding; Heterogeneous Representation Learning; Latent Representations; Feature Learning; Heterogeneous Information Networks

ACM Reference format:

Yuxiao Dong, Nitesh V. Chawla, and Ananthram Swami. 2017. metapath2vec: Scalable Representation Learning for Heterogeneous Networks. In Proceedings of KDD ’17, August 13-17, 2017, Halifax, NS, Canada, 10 pages.
DOI: http://dx.doi.org/10.1145/3097983.3098036

∗This work was done when Yuxiao was a Ph.D. student at University of Notre Dame.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

1 INTRODUCTION

Figure 1: 2D PCA projections of the 128D embeddings of 16 top CS conferences and corresponding high-profile authors. Panels: (a) DeepWalk / node2vec; (b) PTE; (c) title not preserved in this copy; (d) metapath2vec++.
Conferences plotted in each panel: KDD, SIGGRAPH, SIGIR, FOCS, S&P, OSDI, NIPS, IJCAI, ICSE, SIGCOMM, ACL, SIGMOD, CHI, CVPR, WWW, ISCA.
Authors plotted in each panel: S. Shenker, M. I. Jordan, J. Han, A. Tomkins, R. E. Tarjan, D. Song, J. Dean, T. Kanade, R. N. Taylor, C. D. Manning, H. Ishii, H. Jensen, R. Agrawal, J. Malik, O. Mutlu, W. B. Croft.

Neural network-based learning models can represent latent embeddings that capture the internal relations of rich, complex data across various modalities, such as image, audio, and language [ ]. Social and information networks are similarly rich and complex data that encode the dynamics and types of human interactions, and are similarly amenable to representation learning using neural networks. In particular, by mapping the way that people choose friends and maintain connections as a “social language,” recent advances in natural language processing (NLP) [ ] can be naturally applied to network representation learning, most notably the group of NLP models known as word2vec [ ]. A number of recent research publications have proposed word2vec-based network representation learning frameworks, such as DeepWalk [ ], LINE [ ], and node2vec [ ]. Instead of handcrafted network feature design, these representation learning methods enable the automatic discovery of useful and meaningful (latent) features from the “raw networks.”
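For orientation, the skip-gram objective that word2vec-style frameworks commonly optimize when applied to a network can be written as below. The notation is an assumption introduced here and does not appear in this excerpt: V is the set of nodes, N(v) is the set of context nodes collected for node v (for example from random walks), X_v is the embedding of v, and θ denotes all model parameters.

```latex
\max_{\theta} \; \sum_{v \in V} \sum_{c \in N(v)} \log p(c \mid v; \theta),
\qquad
p(c \mid v; \theta) = \frac{\exp(X_c \cdot X_v)}{\sum_{u \in V} \exp(X_u \cdot X_v)}
```

Read this way, nodes play the role of words and N(v) plays the role of a word's context window. The softmax denominator sums over every node in the network, regardless of node type.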

However, this line of work has thus far focused on representation learning for homogeneous networks, which contain a single type of node and a single type of relationship. Yet a large number of social and information networks are heterogeneous in nature, involving a diversity of node types and/or relationships between nodes [ ]. These heterogeneous networks present unique challenges that cannot be handled by representation learning models that are specifically designed for homogeneous networks. Take, for example, a heterogeneous academic network: How do we effectively preserve the concept of “word-context” among multiple types of nodes, e.g., authors, papers, venues, organizations, etc.? Can random walks, such as those used in DeepWalk and node2vec, be applied to networks

