A Comprehensive Survey on Graph Neural
Networks
Zonghan Wu, Shirui Pan, Member, IEEE, Fengwen Chen, Guodong Long,
Chengqi Zhang, Senior Member, IEEE, Philip S. Yu, Fellow, IEEE
Abstract—Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video
processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the
Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and
are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has
imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning
approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in
data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into different
categories. With a focus on graph convolutional networks, we review alternative architectures that have recently been developed; these
learning paradigms include graph attention networks, graph autoencoders, graph generative networks, and graph spatial-temporal
networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes
and benchmarks of the existing algorithms on different learning tasks. Finally, we propose potential research directions in this
fast-growing field.
Index Terms—Deep Learning, graph neural networks, graph convolutional networks, graph representation learning, graph
autoencoder, network embedding
1 INTRODUCTION
The recent success of neural networks has boosted research
on pattern recognition and data mining. Many
machine learning tasks such as object detection [1], [2], machine
translation [3], [4], and speech recognition [5], which
once heavily relied on handcrafted feature engineering to
extract informative feature sets, have recently been revolutionized
by various end-to-end deep learning paradigms,
e.g., convolutional neural networks (CNNs) [6], long short-
term memory (LSTM) [7], and autoencoders. The success
of deep learning in many domains is partially attributed to
the rapidly developing computational resources (e.g., GPU)
and the availability of large training data, and is partially
due to the effectiveness of deep learning in extracting latent
representations from Euclidean data (e.g., images, text, and
video). Taking image analysis as an example, an image can
be represented as a regular grid in the Euclidean space.
A convolutional neural network (CNN) is able to exploit
the shift-invariance, local connectivity, and compositionality
of image data [8], and as a result, CNN can extract local
• Z. Wu, F. Chen, G. Long, C. Zhang are with Centre for
Artificial Intelligence, FEIT, University of Technology Sydney,
NSW 2007, Australia (E-mail: zonghan.wu-3@student.uts.edu.au;
fengwen.chen@student.uts.edu.au; guodong.long@uts.edu.au;
chengqi.zhang@uts.edu.au).
• S. Pan is with Faculty of Information Technology, Monash University,
Clayton, VIC 3800, Australia (Email: shirui.pan@monash.edu).
• P. S. Yu is with Department of Computer Science, University of Illinois
at Chicago, Chicago, IL 60607-7053, USA (Email: psyu@uic.edu)
• Corresponding author: Shirui Pan.
Manuscript received Dec xx, 2018; revised Dec xx, 201x.
meaningful features that are shared across the entire dataset
for various image analysis tasks.
While deep learning has achieved great success on Eu-
clidean data, there is an increasing number of applications
where data are generated from the non-Euclidean domain
and need to be effectively analyzed. For instance, in e-
commerce, a graph-based learning system is able to exploit
the interactions between users and products [9], [10], [11]
to make highly accurate recommendations. In chemistry,
molecules are modeled as graphs and their bioactivity needs
to be identified for drug discovery [12], [13]. In a citation
network, papers are linked to each other via citations and
they need to be categorized into different groups [14],
[15]. The complexity of graph data has imposed significant
challenges on existing machine learning algorithms. This is
because graph data are irregular. Each graph has a variable
number of unordered nodes, and each node in a graph has
a different number of neighbors, so some important
operations (e.g., convolutions) that are easy to compute
in the image domain are no longer directly applicable to the
graph domain. Furthermore, a core assumption of
existing machine learning algorithms is that instances are
independent of each other. However, this is not the case for
graph data where each instance (node) is related to others
(neighbors) via some complex linkage information, which is
used to capture the interdependence among data, including
citations, friendship, and interactions.
Recently, there has been increasing interest in extending deep
learning approaches for graph data. Driven by the success
of deep learning, researchers have borrowed ideas from
convolution networks, recurrent networks, and deep auto-
encoders to design the architecture of graph neural net-
works. To handle the complexity of graph data, new gen-
eralizations and definitions for important operations have
been rapidly developed over the past few years. For in-
stance, Figure 1 illustrates how a kind of graph convolution
is inspired by a standard 2D convolution. This survey aims
to provide a comprehensive overview of these methods, for
both interested researchers who want to enter this rapidly
developing field and experts who would like to compare
graph neural network algorithms.
A Brief History of Graph Neural Networks. The notion
of graph neural networks was first outlined in Gori
et al. (2005) [16], and further elaborated in Micheli (2009)
[17] and Scarselli et al. (2009) [18]. These early studies learn
a target node’s representation by propagating neighbor in-
formation via recurrent neural architectures in an iterative
manner until a stable fixed point is reached. This process
is computationally expensive, and recently there have been
increasing efforts to overcome these challenges [19], [20]. In
our survey, we generalize the term graph neural networks to
represent all deep learning approaches for graph data.
Inspired by the huge success of convolutional networks
in the computer vision domain, a large number of methods
that re-define the notion of convolution for graph data have
emerged recently. These approaches are under the umbrella
of graph convolutional networks (GCNs). The first promi-
nent research on GCNs is presented in Bruna et al. (2013),
which develops a variant of graph convolution based on
spectral graph theory [21]. Since that time, there have been
increasing improvements, extensions, and approximations
on spectral-based graph convolutional networks [12], [14],
[22], [23], [24]. As spectral methods usually handle the
whole graph simultaneously and are difficult to parallelize
or scale to large graphs, spatial-based graph convolutional
networks have rapidly developed recently [25], [26], [27],
[28]. These methods directly perform the convolution in the
graph domain by aggregating the neighbor nodes’ informa-
tion. Together with sampling strategies, the computation can
be performed in a batch of nodes instead of the whole graph
[25], [28], which has the potential to improve efficiency.
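To make the role of sampling concrete, here is a minimal sketch in plain Python (the adjacency-list format, function name, and parameters are illustrative assumptions, not taken from any particular paper): a fixed-size neighbor sample lets a model aggregate over a mini-batch of nodes without touching the whole graph, in the spirit of [25].

```python
import random

def sample_neighbors(adj_list, node, k, seed=None):
    """Draw at most k neighbors of `node` uniformly at random.

    adj_list maps each node id to the list of its neighbor ids.
    Sampling a fixed-size neighborhood bounds the cost of one
    aggregation step, independent of the full graph size.
    """
    rng = random.Random(seed)
    neighbors = adj_list[node]
    if len(neighbors) <= k:
        return list(neighbors)
    return rng.sample(neighbors, k)

# Illustrative use on a toy adjacency list.
adj_list = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
print(sample_neighbors(adj_list, 0, k=2, seed=7))  # e.g., [2, 1]
```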
In addition to graph convolutional networks, many alter-
native graph neural networks have been developed in the
past few years. These approaches include graph attention
networks, graph autoencoders, graph generative networks,
and graph spatial-temporal networks. Details on the catego-
rization of these methods are given in Section 3.
Related surveys on graph neural networks. There are
a limited number of existing reviews on the topic of graph
neural networks. Using the term geometric deep learning,
Bronstein et al. [8] give an overview of deep learning
methods in the non-Euclidean domain, including graphs
and manifolds. While being the first review on graph con-
volution networks, this survey misses several important
spatial-based approaches, including [15], [20], [25], [27],
[28], [29], which update state-of-the-art benchmarks. Fur-
thermore, this survey does not cover many newly devel-
oped architectures which are equally important to graph
convolutional networks. These learning paradigms, includ-
ing graph attention networks, graph autoencoders, graph
generative networks, and graph spatial-temporal networks,
are comprehensively reviewed in this article.

Fig. 1: 2D Convolution vs. Graph Convolution. (a) 2D Convolution. Analogous to a graph, each pixel in an image is taken as a node whose neighbors are determined by the filter size. The 2D convolution takes a weighted average of the pixel values of the red node along with its neighbors; the neighbors of a node are ordered and fixed in size. (b) Graph Convolution. To obtain a hidden representation of the red node, one simple graph convolution operation takes the average value of the node features of the red node along with its neighbors; unlike image data, the neighbors of a node are unordered and variable in size.

Battaglia et al. [30] position graph networks as the building blocks for
learning from relational data, reviewing part of graph neu-
ral networks under a unified framework. However, their
generalized framework is highly abstract, losing insights on
each method from its original paper. Lee et al. [31] conduct
a partial survey on the graph attention model, which is
one type of graph neural network. Most recently, Zhang et
al. [32] present an up-to-date survey on deep learning
for graphs, but miss studies on graph generative and
spatial-temporal networks. In summary, none of the existing
surveys provide a comprehensive overview of graph neural
networks, only covering some of the graph convolution
neural networks and examining a limited number of works,
thereby missing the most recent development of alternative
graph neural networks, such as graph generative networks
and graph spatial-temporal networks.
Graph neural networks vs. network embedding. The
research on graph neural networks is closely related to
graph embedding or network embedding, another topic
which attracts increasing attention from both the data min-
ing and machine learning communities [33], [34], [35], [36],
[37], [38]. Network embedding aims to represent network
vertices into a low-dimensional vector space, by preserving
both network topology structure and node content informa-
tion, so that any subsequent graph analytics tasks such as
classification, clustering, and recommendation can be easily
performed by using simple off-the-shelf machine learning
algorithms (e.g., support vector machines for classification).
Network embedding algorithms are typically unsupervised
and can be broadly classified into
three groups [33], i.e., matrix factorization [39], [40], ran-
dom walks [41], and deep learning approaches. The deep
learning approaches for network embedding at the same
time belong to graph neural networks, which include graph
autoencoder-based algorithms (e.g., DNGR [42] and SDNE
[43]) and graph convolution neural networks with unsupervised
training (e.g., GraphSage [25]). Figure 2 describes the
differences between network embedding and graph neural
networks in this paper.

Fig. 2: Network Embedding vs. Graph Neural Networks.
Our Contributions. Our paper makes notable contributions,
summarized as follows:
• New taxonomy In light of the increasing number of
studies on deep learning for graph data, we propose
a new taxonomy of graph neural networks (GNNs).
In this taxonomy, GNNs are categorized into five
groups: graph convolution networks, graph atten-
tion networks, graph auto-encoders, graph genera-
tive networks, and graph spatial-temporal networks.
We pinpoint the differences between graph neural
networks and network embedding and draw the
connections between different graph neural network
architectures.
• Comprehensive review This survey provides the
most comprehensive overview of modern deep
learning techniques for graph data. For each type
of graph neural network, we provide detailed de-
scriptions of representative algorithms, and make
a necessary comparison and summarize the corresponding
algorithms.
• Abundant resources This survey provides abundant
resources on graph neural networks, which include
state-of-the-art algorithms, benchmark datasets,
open-source codes, and practical applications. This
survey can be used as a hands-on guide for under-
standing, using, and developing different deep learn-
ing approaches for various real-life applications.
• Future directions This survey also highlights the cur-
rent limitations of the existing algorithms, and points
out possible directions in this rapidly developing
field.
Organization of Our Survey. The rest of this survey
is organized as follows. Section 2 defines a list of graph-
related concepts. Section 3 clarifies the categorization of
graph neural networks. Section 4 and Section 5 provide
an overview of graph neural network models. Section 6
presents a gallery of applications across various domains.
Section 7 discusses the current challenges and suggests
future directions. Section 8 summarizes the paper.
TABLE 1: Commonly used notations.

| Notation | Description |
| --- | --- |
| $\|\cdot\|$ | The length of a set |
| $\odot$ | Element-wise product |
| $A^T$ | Transpose of vector/matrix $A$ |
| $[A, B]$ | Concatenation of $A$ and $B$ |
| $G$ | A graph |
| $V$ | The set of nodes in a graph |
| $v_i$ | A node $v_i \in V$ |
| $N(v)$ | The neighbors of node $v$ |
| $E$ | The set of edges in a graph |
| $e_{ij}$ | An edge $e_{ij} \in E$ |
| $X \in \mathbb{R}^{N \times D}$ | The feature matrix of a graph |
| $x \in \mathbb{R}^N$ | The feature vector of a graph in the case of $D = 1$ |
| $X_i \in \mathbb{R}^D$ | The feature vector of the node $v_i$ |
| $N$ | The number of nodes, $N = \|V\|$ |
| $M$ | The number of edges, $M = \|E\|$ |
| $D$ | The dimension of a node vector |
| $T$ | The total number of time steps in a time series |
Fig. 3: Categorization of Graph Neural Networks.
2 DEFINITION
In this section, we provide definitions of basic graph con-
cepts. For easy retrieval, we summarize the commonly used
notations in Table 1.
Definition 1 (Graph). A graph is $G = (V, E, A)$, where $V$ is the set of nodes, $E$ is the set of edges, and $A$ is the adjacency matrix. In a graph, we use $v_i \in V$ to denote a node and $e_{ij} = (v_i, v_j) \in E$ to denote an edge. The adjacency matrix $A$ is an $N \times N$ matrix with $A_{ij} = w_{ij} > 0$ if $e_{ij} \in E$ and $A_{ij} = 0$ if $e_{ij} \notin E$. The degree of a node is the number of edges connected to it.
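To make Definition 1 concrete, the following minimal sketch (plain Python with NumPy; the node indices and edge weights are made up for illustration) builds the adjacency matrix of a small undirected weighted graph and computes node degrees.

```python
import numpy as np

N = 4  # number of nodes (illustrative)
edges = [(0, 1, 1.0), (0, 2, 0.5), (1, 2, 2.0), (2, 3, 1.0)]  # (i, j, w_ij)

A = np.zeros((N, N))
for i, j, w in edges:
    A[i, j] = w  # A_ij = w_ij > 0 if e_ij is in E, 0 otherwise
    A[j, i] = w  # undirected graph: A_ij = A_ji

# The degree of a node is the number of edges connected to it.
degrees = (A > 0).sum(axis=1)
print(degrees)  # [2 2 3 1]
```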
A graph can be associated with node attributes $X$,¹ where $X \in \mathbb{R}^{N \times D}$ is a feature matrix with $X_i \in \mathbb{R}^D$ representing the feature vector of node $v_i$. In the case of $D = 1$, we use $x \in \mathbb{R}^N$ in place of $X$ to denote the feature vector of the graph.
Definition 2 (Directed Graph). A directed graph is a graph with all edges pointing from one node to another. For a directed graph, $A_{ij} \neq A_{ji}$. An undirected graph is a graph with all edges undirected. For an undirected graph, $A_{ij} = A_{ji}$.

¹ Such a graph is referred to as an attributed graph in the literature.

Fig. 4: A Variant of Graph Convolution Networks with Multiple GCN Layers [14]. A GCN layer encapsulates each node's hidden representation by aggregating feature information from its neighbors. After feature aggregation, a non-linear transformation is applied to the resultant outputs. By stacking multiple layers, the final hidden representation of each node receives messages from a further neighborhood.
Definition 3 (Spatial-Temporal Graph). A spatial-temporal graph is an attributed graph where the feature matrix $X$ evolves over time. It is defined as $G = (V, E, A, X)$ with $X \in \mathbb{R}^{T \times N \times D}$, where $T$ is the number of time steps.
3 CATEGORIZATION AND FRAMEWORKS
In this section, we present our taxonomy of graph neural
networks. We consider any differentiable graph models
which incorporate neural architectures as graph neural net-
works. We categorize graph neural networks into graph con-
volution networks, graph attention networks, graph auto-
encoders, graph generative networks and graph spatial-
temporal networks. Of these, graph convolution networks
play a central role in capturing structural dependencies.
As illustrated in Figure 3, methods under other categories
partially utilize graph convolution networks as building
blocks. We summarize the representative methods in each
category in Table 2, and we give a brief introduction of each
category in the following.
3.1 Taxonomy of GNNs
Graph Convolution Networks (GCNs) generalize the operation of convolution from traditional data (images or grids) to graph data. The key is to learn a function $f$ that generates a node $v_i$'s representation by aggregating its own features $X_i$ and its neighbors' features $X_j$, where $j \in N(v_i)$. Figure 4 shows the process of GCNs for node representation learning. Graph convolutional networks play a central role in building up many other complex graph neural network models, including auto-encoder-based models, generative models, and spatial-temporal networks. Figure 5 illustrates several graph neural network models built on GCNs.
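To ground the aggregation function $f$, here is a minimal NumPy sketch of one widely used graph convolution layer, the renormalized propagation rule of [14], $H' = \sigma(\hat{D}^{-1/2}\hat{A}\hat{D}^{-1/2}HW)$ with $\hat{A} = A + I$; the weights below are random stand-ins for parameters that would be learned by gradient descent.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution layer in the style of [14].

    A: (N, N) adjacency matrix; H: (N, d_in) node features;
    W: (d_in, d_out) weight matrix (a stand-in for learned parameters).
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees including self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU non-linearity

# Illustrative forward pass on a 4-node graph with 3 input features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H0 = rng.random((4, 3))
H1 = gcn_layer(A, H0, rng.random((3, 2)))  # new representations, shape (4, 2)
```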
Graph Attention Networks are similar to GCNs and seek an
aggregation function to fuse the neighboring nodes, random
walks, and candidate models in graphs to learn a new
representation. The key difference is that graph attention
networks employ attention mechanisms which assign larger
weights to the more important nodes, walks, or models. The
attention weight is learned together with the neural network
parameters within an end-to-end framework.

Fig. 5: Different Graph Neural Network Models Built with GCNs. (a) Graph Convolution Networks with Pooling Modules for Graph Classification [12]: a GCN layer [14] is followed by a pooling layer that coarsens a graph into sub-graphs, so that node representations on the coarsened graphs encode higher graph-level information; to calculate the probability of each graph label, the output layer is a linear layer with the softmax function. (b) Graph Auto-encoder with GCN [62]: the encoder uses GCN layers to obtain latent representations for each node; the decoder computes the pair-wise distance between the node latent representations produced by the encoder and, after applying a non-linear activation function, reconstructs the graph adjacency matrix. (c) Graph Spatial-Temporal Networks with GCN [74]: a GCN layer is followed by a 1D-CNN layer; the GCN layer operates on $A_t$ and $X_t$ to capture the spatial dependency, while the 1D-CNN layer slides over $X$ along the time axis to capture the temporal dependency; the output layer is a linear transformation that generates a prediction for each node.

Figure 6 illustrates the difference between graph convolutional networks and graph attention networks in aggregating neighbor node information.
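To illustrate the attention-based weighting, here is a minimal single-head sketch in the spirit of [15] (NumPy; the parameters W and a would be learned end-to-end in practice, and all shapes are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def attention_weights(h_i, h_neighbors, W, a, alpha=0.2):
    """Attention coefficients in the spirit of [15]:
    e_ij = LeakyReLU(a^T [W h_i || W h_j]),  a_ij = softmax_j(e_ij).

    h_i: (d,) features of the center node; h_neighbors: (k, d);
    W: (d, d_out) and a: (2 * d_out,) are stand-ins for learned parameters.
    """
    z_i = h_i @ W
    scores = []
    for h_j in h_neighbors:
        e_ij = np.concatenate([z_i, h_j @ W]) @ a
        scores.append(e_ij if e_ij > 0 else alpha * e_ij)  # LeakyReLU
    return softmax(np.array(scores))  # larger weight = more important neighbor
```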
Graph Auto-encoders are unsupervised learning frame-
works which aim to learn low-dimensional node vectors
via an encoder, and then reconstruct the graph data via
a decoder. Graph autoencoders are a popular approach to
learn the graph embedding, for both plain graphs with-
out attributed information [42], [43] as well as attributed
graphs [64], [65]. For plain graphs, many algorithms directly
preprocess the adjacency matrix, by either constructing a
new matrix (e.g., a pointwise mutual information matrix) with
rich information [42] or feeding the adjacency matrix to
an autoencoder model and capturing both first-order and
second-order information [43]. For attributed graphs, graph
autoencoder models tend to employ GCN [14] as a building
block for the encoder and reconstruct the structure informa-
tion via a link prediction decoder [62], [64].
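A minimal sketch of this encoder-decoder pattern in the style of [62] follows (it reuses the gcn_layer sketch from Section 3.1; the inner-product decoder and the stand-in weights are illustrative choices):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gae_forward(A, X, W1, W2):
    """Encoder: two GCN layers map node features X to latent vectors Z.
    Decoder: pairwise inner products of Z, squashed by a sigmoid,
    reconstruct the adjacency matrix; training pushes A_rec toward A.
    """
    Z = gcn_layer(A, gcn_layer(A, X, W1), W2)  # node embeddings, (N, d_latent)
    A_rec = sigmoid(Z @ Z.T)                   # reconstructed adjacency, (N, N)
    return Z, A_rec
```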

TABLE 2: Representative Publications of Graph Neural Networks

| Category | Publications |
| --- | --- |
| Graph Convolution Networks: Spectral-based | [12], [14], [21], [22], [23], [24], [44], [45], [46] |
| Graph Convolution Networks: Spatial-based | [13], [16], [17], [18], [19], [20], [25], [26], [27], [28], [47], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57] |
| Graph Convolution Networks: Pooling modules | [12], [22], [58], [59] |
| Graph Attention Networks | [15], [29], [60], [61] |
| Graph Auto-encoders | [42], [43], [62], [63], [64], [65], [66] |
| Graph Generative Networks | [67], [68], [69], [70], [71] |
| Graph Spatial-Temporal Networks | [72], [73], [74], [75], [76] |
Fig. 6: Differences between graph convolutional networks and graph attention networks. (a) Graph convolution networks [14] explicitly assign a non-parametric weight $a_{ij} = 1/\sqrt{\deg(v_i)\deg(v_j)}$ to the neighbor $v_j$ of $v_i$ during the aggregation process. (b) Graph attention networks [15] implicitly capture the weight $a_{ij}$ via an end-to-end neural network architecture, so that more important nodes receive larger weights.
Graph Generative Networks aim to generate plausible
structures from data. Generating graphs given a graph
empirical distribution is fundamentally challenging, mainly
because graphs are complex data structures. To address this
problem, researchers have explored factoring the generation
process into alternately forming nodes and edges [67], [68],
or employing generative adversarial training [69], [70]. One
promising application domain of graph generative networks
is chemical compound synthesis. In a chemical graph, atoms
are treated as nodes and chemical bonds are treated as
edges. The task is to discover new synthesizable molecules
which possess certain chemical and physical properties.
Graph Spatial-temporal Networks aim to learn unseen pat-
terns from spatial-temporal graphs, which are increasingly
important in many applications such as traffic forecasting
and human activity prediction. For instance, the underlying
road traffic network is a natural graph where each key loca-
tion is a node whose traffic data is continuously monitored.
By developing effective graph spatial-temporal network
models, we can accurately predict the traffic status over
the whole traffic system [73], [74]. The key idea of graph
spatial-temporal networks is to consider spatial dependency
and temporal dependency at the same time. Many current
approaches apply GCNs to capture the spatial dependency,
together with an RNN [73] or a CNN [74] to model the temporal
dependency.
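As a toy sketch of this GCN-plus-CNN pattern (compare Figure 5(c); this is not the architecture of [74], and the temporal filter, shapes, and function names are illustrative assumptions; gcn_layer is the earlier sketch):

```python
import numpy as np

def spatial_temporal_block(A, X, W):
    """Apply a graph convolution at each time step (spatial dependency),
    then a length-3 1D convolution along time at each node (temporal
    dependency). A: (N, N); X: (T, N, D); W: (D, d_out)."""
    spatial = np.stack([gcn_layer(A, X[t], W) for t in range(X.shape[0])])
    kernel = np.array([0.25, 0.5, 0.25])  # illustrative temporal filter
    T = spatial.shape[0]
    temporal = np.stack([
        sum(kernel[k] * spatial[t + k] for k in range(len(kernel)))
        for t in range(T - len(kernel) + 1)
    ])
    return temporal  # shape (T - 2, N, d_out)
```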
3.2 Frameworks
Graph neural networks, graph convolution networks
(GCNs) in particular, try to replicate the success of CNNs
on graph data by defining graph convolutions via graph
spectral theory or spatial locality. With the graph structure
and node content information as inputs, the outputs of a GCN
can focus on different graph analytics tasks through one of the
following mechanisms:
• Node-level outputs relate to node regression and
classification tasks. As a graph convolution module
directly gives nodes' latent representations, a multilayer
perceptron or a softmax layer is used as the final
layer of the GCN. We review graph convolution modules
in Section 4.1 and Section 4.2.
• Edge-level outputs relate to the edge classifica-
tion and link prediction tasks. To predict the la-
bel/connection strength of an edge, an additional
function will take two nodes’ latent representations
from the graph convolution module as inputs.
• Graph-level outputs relate to the graph classification
task. To obtain a compact representation on the graph
level, a pooling module is used to coarsen a graph into
sub-graphs or to sum/average over the node representations
(a minimal readout sketch follows this list). We review the
graph pooling module in Section 4.3.
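For instance, the simplest such readout averages or sums the node representations produced by the last graph convolution layer (a minimal sketch; whether mean or sum works better is task-dependent):

```python
import numpy as np

def mean_readout(H):
    """H: (N, d) node representations -> (d,) graph-level representation."""
    return H.mean(axis=0)

def sum_readout(H):
    return H.sum(axis=0)
```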
In Table 3, we list the details of the inputs and outputs
of the main GCN methods. In particular, we summarize
output mechanisms in between each GCN layer and in the
final layer of each method. The output mechanisms may
involve several pooling operations, which are discussed in
Section 4.3.
End-to-end Training Frameworks. Graph convolutional net-
works can be trained in a (semi-) supervised or purely un-
supervised way within an end-to-end learning framework,
depending on the learning tasks and label information avail-
able at hand.
• Semi-supervised learning for node-level classifi-
cation. Given a single network with partial nodes
being labeled and others remaining unlabeled, graph
convolutional networks can learn a robust model that
effectively identifies the class labels for the unlabeled
nodes [14]. To this end, an end-to-end framework can
be built by stacking a couple of graph convolutional
layers followed by a softmax layer for multi-class
classification.
• Supervised learning for graph-level classification.
Given a graph dataset, graph-level classification aims
to predict the class label(s) for an entire graph [58],