Deeper Insights into Graph Convolutional Networks
for Semi-Supervised Learning
Qimai Li^1, Zhichao Han^{1,2}, Xiao-Ming Wu^{1,∗}
^1 The Hong Kong Polytechnic University
^2 ETH Zurich
csqmli@comp.polyu.edu.hk, zhhan@student.ethz.ch, xiao-ming.wu@polyu.edu.hk
∗ Corresponding author.
Abstract
Many interesting problems in machine learning are being
revisited with new deep learning tools. For graph-based semi-
supervised learning, a recent important development is graph
convolutional networks (GCNs), which nicely integrate local
vertex features and graph topology in the convolutional lay-
ers. Although the GCN model compares favorably with other
state-of-the-art methods, its working mechanisms are not clear, and it
still requires a considerable amount of labeled data for validation and
model selection.
In this paper, we develop deeper insights into the GCN model
and address its fundamental limits. First, we show that the
graph convolution of the GCN model is actually a special
form of Laplacian smoothing, which is the key reason why
GCNs work, but it also brings potential concerns of over-
smoothing with many convolutional layers. Second, to over-
come the limits of the GCN model with shallow architectures,
we propose both co-training and self-training approaches to
train GCNs. Our approaches significantly improve GCNs in
learning with very few labels, and exempt them from requir-
ing additional labels for validation. Extensive experiments on
benchmarks have verified our theory and proposals.
1 Introduction
The breakthroughs in deep learning have led to a paradigm
shift in artificial intelligence and machine learning. On the
one hand, numerous old problems have been revisited with
deep neural networks and huge progress has been made in
many tasks that previously seemed out of reach, such as machine
translation and computer vision. On the other hand, new
techniques such as geometric deep learning (Bronstein et al.
2017) are being developed to generalize deep neural models
to new or non-traditional domains.
It is well known that training a deep neural model typi-
cally requires a large amount of labeled data, a requirement that cannot
be met in many scenarios due to the high cost of labeling
training data. To reduce the amount of data needed for train-
ing, a recent surge of research interest has focused on few-
shot learning (Lake, Salakhutdinov, and Tenenbaum 2015;
Rezende et al. 2016) – to learn a classification model with
very few examples from each class. Closely related to few-
shot learning is semi-supervised learning, where a large
amount of unlabeled data can be utilized in training alongside a
typically small amount of labeled data.
Many studies have shown that leveraging unlabeled
data in training can improve learning accuracy significantly
if used properly (Zhu and Goldberg 2009). The key issue is
to maximize the effective utilization of structural and fea-
ture information of unlabeled data. Due to the powerful fea-
ture extraction capability and recent success of deep neu-
ral networks, there have been some successful attempts to
revisit semi-supervised learning with neural-network-based
models, including ladder network (Rasmus et al. 2015),
semi-supervised embedding (Weston et al. 2008), planetoid
(Yang, Cohen, and Salakhutdinov 2016), and graph convo-
lutional networks (Kipf and Welling 2017).
The recently developed graph convolutional neural networks (GCNNs)
(Defferrard, Bresson, and Vandergheynst 2016) are a successful attempt
to generalize the powerful convolutional neural networks (CNNs), which
operate on Euclidean data, to modeling graph-structured data. In their
pilot work (Kipf and Welling 2017), Kipf and Welling pro-
posed a simplified type of GCNNs, called graph convolu-
tional networks (GCNs), and applied it to semi-supervised
classification. The GCN model naturally integrates the con-
nectivity patterns and feature attributes of graph-structured
data, and outperforms many state-of-the-art methods signif-
icantly on some benchmarks. Nevertheless, it suffers from problems
similar to those faced by other neural-network-based models. The working
mechanisms of the GCN model for semi-supervised learning are not clear,
and the training of GCNs still requires a considerable amount of labeled
data for parameter tuning and model selection, which defeats the purpose
of semi-supervised learning.
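For reference, the layer-wise propagation rule of the GCN model (Kipf and
Welling 2017), restated here for convenience, is

H^{(l+1)} = \sigma\big( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)} \big),

where \tilde{A} = A + I is the adjacency matrix with self-loops added,
\tilde{D} is its degree matrix, H^{(l)} is the matrix of vertex
representations at layer l (with H^{(0)} = X, the input feature matrix),
W^{(l)} is a trainable weight matrix, and \sigma is a nonlinearity such
as ReLU.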
In this paper, we demystify the GCN model for semi-
supervised learning. In particular, we show that the graph
convolution of the GCN model is simply a special form of
Laplacian smoothing, which mixes the features of a vertex
and its nearby neighbors. The smoothing operation makes
the features of vertices in the same cluster similar, thus
greatly easing the classification task, which is the key rea-
son why GCNs work so well. However, it also brings poten-
tial concerns of over-smoothing. If a GCN is deep with
many convolutional layers, the output features may be over-
smoothed and vertices from different clusters may become
indistinguishable. The mixing happens quickly on small
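The smoothing and over-smoothing behavior discussed above can be
illustrated with a minimal NumPy sketch (an illustration only, not code
from this paper; the toy graph and random features below are made up).
Repeatedly applying the operator S = \tilde{D}^{-1/2} \tilde{A}
\tilde{D}^{-1/2} mixes the features of each vertex with those of its
neighbors; after many applications the rows of the feature matrix
collapse onto a single direction (row i approaches \sqrt{d_i + 1} times a
common vector), so vertices from different clusters become
indistinguishable.

import numpy as np

def smooth(A, X, steps):
    # Apply `steps` rounds of the augmented, symmetrically normalized
    # adjacency operator S = D̃^{-1/2} Ã D̃^{-1/2} to the feature matrix X.
    A_tilde = A + np.eye(A.shape[0])          # add self-loops
    d = A_tilde.sum(axis=1)                   # augmented degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # Laplacian-smoothing operator
    for _ in range(steps):
        X = S @ X
    return X

# Toy graph: two triangles (clusters) joined by a single edge.
A = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)
X = np.random.randn(6, 2)                     # random 2-d vertex features

print(np.round(smooth(A, X, 1), 2))   # one layer: neighbor features mixed
print(np.round(smooth(A, X, 50), 2))  # many layers: rows nearly parallel,
                                      # clusters no longer distinguishable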