Cluster-GCN: An Efficient Algorithm for Training Deep and
Large Graph Convolutional Networks
Wei-Lin Chiang∗
National Taiwan University
r06922166@csie.ntu.edu.tw
Xuanqing Liu∗
University of California, Los Angeles
xqliu@cs.ucla.edu
Si Si
Google Research
sisidaisy@google.com
Yang Li
Google Research
liyang@google.com
Samy Bengio
Google Research
bengio@google.com
Cho-Jui Hsieh
University of California, Los Angeles
chohsieh@cs.ucla.edu
ABSTRACT
Graph convolutional network (GCN) has been successfully applied
to many graph-based applications; however, training a large-scale
GCN remains challenging. Current SGD-based algorithms suffer
from either a high computational cost that grows exponentially with
the number of GCN layers, or a large space requirement for keeping
the entire graph and the embedding of each node in memory. In
this paper, we propose Cluster-GCN, a novel GCN algorithm that is
suitable for SGD-based training by exploiting the graph clustering
structure. Cluster-GCN works as follows: at each step, it samples
a block of nodes associated with a dense subgraph identified by
a graph clustering algorithm, and restricts the neighborhood search
to within this subgraph. This simple but effective strategy leads
to significantly improved memory and computational efficiency
while achieving test accuracy comparable to previous algorithms.
To test the scalability of our algorithm, we create a new Amazon2M
dataset with 2 million nodes and 61 million edges, which is more
than 5 times larger than the previous largest publicly available
dataset (Reddit). For training a 3-layer GCN on this data,
Cluster-GCN is faster than the previous state-of-the-art VR-GCN
(1523 seconds vs. 1961 seconds) while using much less memory (2.2GB
vs. 11.2GB). For training a 4-layer GCN on this data, our
algorithm can finish in around 36 minutes, while all the existing
GCN training algorithms fail to train due to out-of-memory
issues. Furthermore, Cluster-GCN allows us to train much deeper
GCNs without much time and memory overhead, which leads to
improved prediction accuracy—using a 5-layer Cluster-GCN, we
achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset,
while the previous best result was 98.71 by [16].
ACM Reference Format:
Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, and Cho-Jui
Hsieh. 2019. Cluster-GCN: An Efficient Algorithm for Training Deep and
Large Graph Convolutional Networks. In The 25th ACM SIGKDD Conference
on Knowledge Discovery and Data Mining (KDD ’19), August 4–8, 2019,
Anchorage, AK, USA. ACM, New York, NY, USA, 10 pages.
https://doi.org/10.1145/3292500.3330925
∗ This work was done during the first and the second author’s internship at Google
Research.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
KDD ’19, August 4–8, 2019, Anchorage, AK, USA
© 2019 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-6201-6/19/08.
https://doi.org/10.1145/3292500.3330925
1 INTRODUCTION
Graph convolutional network (GCN) [9] has become increasingly
popular in addressing many graph-based applications, including
semi-supervised node classification [9], link prediction [17] and
recommender systems [15]. Given a graph, GCN uses a graph
convolution operation to obtain node embeddings layer by layer—at
each layer, the embedding of a node is obtained by gathering the
embeddings of its neighbors, followed by one or a few layers of
linear transformations and nonlinear activations. The final layer
embedding is then used for some end tasks. For instance, in node
classification problems, the final layer embedding is passed to a
classifier to predict node labels, and thus the parameters of GCN
can be trained in an end-to-end manner.
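To make the layer-wise operation concrete, below is a minimal sketch of a single graph convolution layer in NumPy. This is our illustration rather than code from the paper; the names A_hat (a normalized adjacency matrix with self-loops), X (the node embedding matrix), and W (the layer's weight matrix) are assumptions made for this example.

import numpy as np

def gcn_layer(A_hat, X, W):
    # Gather (aggregate) neighbor embeddings through the normalized adjacency: (N, F_in)
    H = A_hat @ X
    # Linear transformation to the next embedding dimension: (N, F_out)
    Z = H @ W
    # Nonlinear activation (ReLU)
    return np.maximum(Z, 0)

Stacking L such layers yields the final node embeddings; for node classification, the last layer's output is fed to a classifier whose loss is back-propagated through all layers.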
Since the graph convolution operator in GCN propagates embeddings
along the interactions between nodes in the graph, training is quite
challenging. Unlike other neural networks, whose training loss can
be perfectly decomposed into individual terms on each sample, a
loss term in GCN (e.g., the classification loss on a single node)
depends on a huge number of other nodes, especially when the GCN
goes deep. Due to this node dependence, GCN training is very slow
and requires a lot of memory: back-propagation needs to store all
the embeddings in the computation graph in GPU memory.
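To see why, consider the standard semi-supervised node classification objective (written here in our own notation as an illustration):

\mathcal{L} = \frac{1}{|\mathcal{Y}_L|} \sum_{i \in \mathcal{Y}_L} \mathrm{loss}\bigl(y_i, z_i^{(L)}\bigr),

where \mathcal{Y}_L is the set of labeled nodes and z_i^{(L)} is the final-layer embedding of node i. Since z_i^{(L)} is built from the layer-(L-1) embeddings of node i's neighbors, which in turn depend on their own neighbors, a single loss term involves the entire L-hop neighborhood of node i and cannot be computed from node i's features alone.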
Previous GCN Training Algorithms: To demonstrate the need
for developing a scalable GCN training algorithm, we first discuss
the pros and cons of existing approaches in terms of 1) memory
requirement¹, 2) time per epoch², and 3) convergence speed (loss
reduction) per epoch. These three factors are crucial for evaluating
a training algorithm. Note that the memory requirement directly
restricts the scalability of an algorithm, and the latter two factors
together determine the training speed. In the following discussion,
we denote by N the number of nodes in the graph, F the embedding
dimension, and L the number of layers, and use them to analyze
classic GCN training algorithms.
• Full-batch gradient descent is proposed in the first GCN paper [9].
To compute the full gradient, it requires storing all the
¹ Here we consider the memory for storing node embeddings, which is dense and
usually dominates the overall memory usage for deep GCN.
² An epoch means a complete data pass.