Adversarial Graph Contrastive Learning
with Information Regularization
Shengyu Feng
shengyu8@illinois.edu
University of Illinois at Urbana-Champaign
USA
Baoyu Jing
baoyuj2@illinois.edu
University of Illinois at Urbana-Champaign
USA
Yada Zhu
yzhu@us.ibm.com
IBM Research
USA
Hanghang Tong
htong@illinois.edu
University of Illinois at Urbana-Champaign
USA
ABSTRACT
Contrastive learning is an effective unsupervised method in graph
representation learning. Recently, the data augmentation based contrastive
learning method has been extended from images to graphs.
However, most prior works are directly adapted from the models
designed for images. Unlike data augmentation on images, data
augmentation on graphs is far less intuitive, and it is much harder
to produce high-quality contrastive samples, which are the key to
the performance of contrastive learning models. This leaves much
space for improvement over the existing graph contrastive learning
frameworks. In this work, by introducing an adversarial graph view
and an information regularizer, we propose a simple but effective
method, Adversarial Graph Contrastive Learning (ArieL), to extract
informative contrastive samples within a reasonable constraint. It
consistently outperforms the current graph contrastive learning
methods in the node classification task over various real-world
datasets and further improves the robustness of graph contrastive
learning.
CCS CONCEPTS
• Mathematics of computing → Information theory; Graph algorithms; • Computing methodologies → Neural networks; Learning latent representations.
KEYWORDS
graph representation learning, contrastive learning, adversarial
training, mutual information
ACM Reference Format:
Shengyu Feng, Baoyu Jing, Yada Zhu, and Hanghang Tong. 2022. Adversarial
Graph Contrastive Learning with Information Regularization. In Proceedings
of the ACM Web Conference 2022 (WWW ’22), April 25–29, 2022, Virtual Event,
Lyon, France. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3485447.3512183
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
WWW ’22, April 25–29, 2022, Virtual Event, Lyon, France.
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9096-5/22/04...$15.00
https://doi.org/10.1145/3485447.3512183
1 INTRODUCTION
Contrastive learning is a widely used technique in various graph
representation learning tasks. In contrastive learning, the model
tries to minimize the distances among positive pairs and maximize
the distances among negative pairs in the embedding space. The
definition of positive and negative pairs is the key component
in contrastive learning. Earlier methods like DeepWalk [24] and
node2vec [6] define positive and negative pairs based on the co-occurrence
of node pairs in random walks. For knowledge graph
embedding, it is a common practice to define positive and negative
pairs based on translations [2, 11, 18, 33, 34, 36].
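Concretely, the pull-together/push-apart objective behind many of the SimCLR-style methods discussed below is an InfoNCE-type loss. The following is a minimal NumPy sketch, not the formulation of any specific paper; the function name, shapes, and temperature value are illustrative assumptions:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.5):
    """InfoNCE-style contrastive loss over two views.

    z1, z2: (N, d) embeddings; row i of z1 and row i of z2 form a
    positive pair, and every other cross-view row is a negative.
    """
    # L2-normalize rows so dot products become cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature            # (N, N) similarity matrix
    # Row-wise log-softmax; the diagonal entries are the positive pairs.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Minimizing this quantity drives the diagonal (positive-pair) similarities up relative to the off-diagonal (negative-pair) similarities.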
Recently, the breakthroughs of contrastive learning in computer
vision have inspired some works to apply similar ideas from
visual representation learning to graph representation learning.
To name a few, Deep Graph Infomax (DGI) [32] extends Deep InfoMax
[9] and achieves significant improvements over previous
random-walk based methods. Graphical Mutual Information (GMI)
[23] uses the same framework as DGI but generalizes the concept
of mutual information from the vector space to the graph domain. Contrastive
multi-view graph representation learning (referred to as
MVGRL in this paper) [7] further improves DGI by introducing
graph diffusion into the contrastive learning framework. The more
recent works often follow the data augmentation based contrastive
learning methods [4, 8], which treat the data augmented samples
from the same instance as positive pairs and those from different instances as
negative pairs. Graph Contrastive Coding (GCC) [25] uses random
walks with restart [29] to generate two subgraphs for each node
as two data augmented samples. Graph Contrastive learning with
Adaptive augmentation (GCA) [41] introduces an adaptive data
augmentation method which perturbs both the node features and
edges according to their importance, and it is trained in a similar
way to the well-known visual contrastive learning framework SimCLR
[4]. Its preliminary work, which uses uniform random sampling
rather than adaptive sampling, is referred to as GRACE [40]
in this paper. Robinson et al. [26] propose a way to select hard
negative samples based on distances in the embedding space, and use
it to obtain high-quality graph embeddings. There are also many
works [38, 39] systematically studying data augmentation on
graphs.
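As a concrete illustration of the augmentations these methods rely on, a uniform random edge-dropping and feature-masking view generator in the spirit of GRACE might be sketched as follows; the function name and drop rates are our own illustrative choices, not taken from the original papers:

```python
import numpy as np

def augment_graph(edge_index, x, p_edge=0.2, p_feat=0.2, seed=None):
    """Generate one augmented view of a graph.

    edge_index: (2, E) int array of edges; x: (N, d) node features.
    Each edge is dropped independently with probability p_edge, and
    each feature dimension is masked (zeroed) with probability p_feat.
    """
    rng = np.random.default_rng(seed)
    keep_edges = rng.random(edge_index.shape[1]) >= p_edge
    keep_feats = rng.random(x.shape[1]) >= p_feat
    return edge_index[:, keep_edges], x * keep_feats
```

Two independent calls produce the two views whose embeddings are then fed to the contrastive loss as a positive pair.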
However, unlike the transformations on images, the transforma-
tions on graphs are far less intuitive to human beings. The data
augmentation on the graph, could be either too similar to or totally