A Representation Learning Framework for Property Graphs
Yifan Hou¹, Hongzhi Chen¹, Changji Li¹, James Cheng¹, Ming-Chang Yang¹, Fan Yu²
¹Department of Computer Science and Engineering, The Chinese University of Hong Kong
{yfhou,hzchen,cjli,jcheng,mcyang}@cse.cuhk.edu.hk
²Distributed and Parallel Software Lab, Central Software Institute, 2012 Lab, Huawei Technologies Co. Ltd.
fan.yu@huawei.com
ABSTRACT
Representation learning on graphs, also called graph embedding, has demonstrated its significant impact on a series of machine learning applications such as classification, prediction and recommendation. However, existing work has largely ignored the rich information contained in the properties (or attributes) of both nodes and edges of graphs in modern applications, e.g., those represented by property graphs. To date, most existing graph embedding methods either focus on plain graphs with only the graph topology, or consider properties on nodes only. We propose PGE, a graph representation learning framework that incorporates both node and edge properties into the graph embedding procedure. PGE uses node clustering to assign biases that differentiate the neighbors of a node, and leverages multiple data-driven matrices to aggregate the property information of neighbors sampled based on a biased strategy. PGE adopts the popular inductive model for neighborhood aggregation. We provide detailed analyses of the efficacy of our method and validate the performance of PGE by showing how it achieves better embedding results than state-of-the-art graph embedding methods on benchmark applications such as node classification and link prediction over real-world datasets.
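To make the pipeline described above concrete, the following minimal Python sketch illustrates the three steps in an illustrative form (this is not the authors' implementation): cluster nodes on their property vectors, sample each node's neighbors with a bias derived from the clustering result, and aggregate the sampled neighbors' properties together with the node's own properties through transformation matrices. The use of k-means, the bias values b_same/b_diff, the sample size, and the mean aggregator are assumptions made only for illustration.

# A minimal, illustrative sketch (not the authors' implementation) of the three
# steps described above: (1) cluster nodes by their property vectors,
# (2) sample each node's neighbors with a bias that depends on whether the
# neighbor falls in the same cluster, and (3) aggregate the sampled neighbors'
# properties with the node's own properties via transformation matrices.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy property graph: a node feature matrix X and a random adjacency list.
num_nodes, dim = 20, 8
X = rng.normal(size=(num_nodes, dim))            # node property vectors
adj = {v: [u for u in range(num_nodes) if u != v and rng.random() < 0.2]
       for v in range(num_nodes)}

# Step 1: cluster nodes on their properties (k-means is an assumed choice).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

def biased_sample(v, k=5, b_same=1.0, b_diff=2.0):
    # Sample up to k neighbors of v; b_same/b_diff are illustrative bias values
    # favoring neighbors that lie in a different cluster than v.
    nbrs = adj[v]
    if not nbrs:
        return []
    w = np.array([b_same if labels[u] == labels[v] else b_diff for u in nbrs])
    k = min(k, len(nbrs))
    return list(rng.choice(nbrs, size=k, replace=False, p=w / w.sum()))

# Step 3: aggregate the sampled neighbors' mean properties with the node's own
# properties using two (here randomly initialized) matrices.
W_self = rng.normal(size=(dim, dim))
W_nbr = rng.normal(size=(dim, dim))

def embed(v):
    sampled = biased_sample(v)
    nbr_mean = X[sampled].mean(axis=0) if sampled else np.zeros(dim)
    return np.tanh(X[v] @ W_self + nbr_mean @ W_nbr)

print(embed(0).shape)    # -> (8,)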
KEYWORDS
graph neural networks, graph embedding, property graphs, representation learning
ACM Reference Format:
Yifan Hou, Hongzhi Chen, Changji Li, James Cheng, Ming-Chang Yang, and Fan Yu. 2019. A Representation Learning Framework for Property Graphs. In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '19), August 4–8, 2019, Anchorage, AK, USA. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3292500.3330948
1 INTRODUCTION
Graphs are ubiquitous today due to the flexibility of using graphs to model data in a wide spectrum of applications. In recent years, more and more machine learning applications conduct classification or
prediction based on graph data [7, 15, 17, 28], such as classifying protein functions in biological graphs, understanding the relationships between users in online social networks, and predicting purchase patterns in buyers-products-sellers graphs on online e-commerce platforms. However, it is not easy to directly make use of the structural information of graphs in these applications, as graph data are high-dimensional and non-Euclidean. On the other hand, considering only graph statistics such as degrees [6], kernel functions [14], or local neighborhood structures [24] often provides limited information and hence affects the accuracy of classification/prediction.
Representation learning methods [5] attempt to solve the above-mentioned problem by constructing an embedding for each node in a graph, i.e., a mapping from a node to a vector in a low-dimensional Euclidean space, so that geometric metrics (e.g., Euclidean distance) in the embedding space represent the structural information. Such graph embeddings [15, 17] have achieved good performance for classification/prediction on plain graphs (i.e., graphs with only the pure topology, without node/edge labels and properties). In practice, however, most real-world graphs do not only contain the topology information, but also labels and properties (also called attributes) on the entities (i.e., nodes) and relationships (i.e., edges). For example, in the companies that we collaborate with, most of their graphs (e.g., various graphs related to products, buyers and sellers from an online e-commerce platform; mobile phone call networks and other communication networks from a service provider) contain rich node properties (e.g., user profiles, product details) and edge properties (e.g., transaction records, phone call details). We call such graphs property graphs. Existing methods [10, 16, 18, 22, 30, 31, 36] have not considered taking the rich information carried by both nodes and edges into the graph embedding procedure.
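As a concrete, hypothetical illustration of a property graph in the above sense, the short Python/NetworkX snippet below builds a small graph whose nodes and edges both carry attribute maps; all node and edge properties shown are made-up examples, not data from the paper or its industry collaborators.

# A tiny, hypothetical property graph: both nodes and edges carry attribute
# maps. All property values below are invented for illustration only.
import networkx as nx

G = nx.Graph()

# Nodes with properties (e.g., user profiles, product details).
G.add_node("u1", kind="user", age=34, city="Hong Kong")
G.add_node("u2", kind="user", age=27, city="Shenzhen")
G.add_node("p1", kind="product", category="phone", price=499.0)

# Edges with properties (e.g., transaction records, call details).
G.add_edge("u1", "p1", kind="purchase", amount=1, timestamp="2019-03-02")
G.add_edge("u1", "u2", kind="call", duration_sec=184)

# A plain-graph embedding method sees only the topology below; a
# property-graph method such as PGE also uses the attribute maps.
print(list(G.edges()))        # [('u1', 'p1'), ('u1', 'u2')]
print(G.nodes["u1"])          # {'kind': 'user', 'age': 34, 'city': 'Hong Kong'}
print(G.edges["u1", "p1"])    # {'kind': 'purchase', 'amount': 1, ...}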
This paper studies the problem of property graph embedding. There are two main challenges. First, each node v may have many properties and it is hard to find which properties may have greater influence on v for a specific application. For example, consider the classification of papers into different topics for a citation graph where nodes represent papers and edges model citation relationships. Suppose that each node has two properties, "year" and "title". Intuitively, the property "title" is likely to be more important for paper classification than the property "year". Thus, how to measure the influence of the properties on each node for different applications needs to be considered. Second, for each node v, its neighbors, as well as the connecting edges, may have different properties. How to measure the influences of both the neighbors and the connecting edges on v for different applications poses another challenge. In the