STAR-GCN: Stacked and Reconstructed Graph Convolutional Networks for
Recommender Systems
Jiani Zhang
1
, Xingjian Shi
2
, Shenglin Zhao
3
and Irwin King
1
1
The Chinese University of Hong Kong, Hong Kong, China
2
Hong Kong University of Science and Technology, Hong Kong, China
3
Youtu Lab, Tencent, Shenzhen, China
{jnzhang, king}@cse.cuhk.edu.hk, xshiab@connect.ust.hk, zsl.zju@gmail.com
Abstract
We propose a new
STA
cked and
R
econstructed
G
raph
C
onvolutional
N
etworks (STAR-GCN) ar-
chitecture to learn node representations for boost-
ing the performance in recommender systems, es-
pecially in the cold start scenario. STAR-GCN em-
ploys a stack of GCN encoder-decoders combined
with intermediate supervision to improve the final
prediction performance. Unlike the graph convo-
lutional matrix completion model with one-hot en-
coding node inputs, our STAR-GCN learns low-
dimensional user and item latent factors as the input
to restrain the model space complexity. Moreover,
our STAR-GCN can produce node embeddings for
new nodes by reconstructing masked input node em-
beddings, which essentially tackles the cold start
problem. Furthermore, we discover a label leak-
age issue when training GCN-based models for
link prediction tasks and propose a training strat-
egy to avoid the issue. Empirical results on multi-
ple rating prediction benchmarks demonstrate our
model achieves state-of-the-art performance in four
out of five real-world datasets and significant im-
provements in predicting ratings in the cold start
scenario. The code implementation is available in
https://github.com/jennyzhang0215/STAR-GCN.
1 Introduction
Recommender systems, which analyze users’ preference pat-
terns to suggest potential targets, are indispensable in content
providers, electronic retailers, web search engines, etc. The
key mathematical problem underlying recommender systems
is matrix completion
[
Cand
`
es and Recht, 2009
]
. Assume there
are
N
users and
M
items, the recommendation algorithm aims
to fill in the missing entries in the
N × M
rating matrix given
the existing entries.
The classical way to solve this problem is via Matrix Factor-
ization (MF)
[
Koren et al., 2009
]
, in which the rating scores are
generated by functions over the latent factors or embeddings
of users and items. Recent advancements in deep learning,
especially Graph Convolutional Networks (GCN)
[
Defferrard
et al., 2016; Bronstein et al., 2017; Kipf and Welling, 2017;
Hamilton et al., 2017
]
, have brought new ideas for tackling
this essential artificial intelligence problem. GCN general-
izes the definition of convolution from the regular grid to
irregular grid, like graph structures. The GCN framework gen-
erates node representations by a localized parameter-sharing
operator, known as graph aggregator
[
Hamilton et al., 2017;
Zhang et al., 2018
]
. A graph aggregator calculates a node’s
representation by transforming and aggregating the features of
its local neighborhoods. By stacking multiple graph aggrega-
tors and nonlinear functions, we build a deep neural network
that can extract features across far reaches of a graph. Because
the local neighborhood set can be viewed as the receptive field
of a convolution kernel, this kind of neighborhood aggrega-
tion methods is named as graph convolution, which also have
connections to spectral graph theory
[
Kipf and Welling, 2017
]
.
Monti et al.
[
2017
]
proposed the first GCN-based method
for recommender systems. In their approach, GCN was used
to aggregate information from two auxiliary user-user and
item-item graphs. The latent factors of users and items were
updated after each aggregation step, and a combined objective
function of GCN and MF was used to train the model. After
that, Berg et al.
[
2017
]
proposed the Graph Convolutional
Matrix Completion (GC-MC) model. GC-MC directly charac-
terized the relationship between users and items as a bipartite
interaction graph. Two multi-link graph convolution layers
were used to aggregate user features and item features. The rat-
ings were estimated by predicting the edge labels. Thanks to
the power of GCN in learning high-quality user and item repre-
sentations, GC-MC has achieved state-of-the-art performance
in several public recommendation benchmarks.
While being powerful, the GC-MC model has two signif-
icant limitations. To distinguish each node, the model uses
one-hot vectors as node input. This makes the input dimen-
sionality proportional to the total number of nodes and thus
is not scalable to large graphs. Moreover, the model is unable
to predict the ratings for new users or items that are not seen
in the training phase because we cannot represent unknown
nodes as one-hot vectors. The task of predicting ratings for
new users or items is also known as the cold start problem.
In this paper, we propose a new architecture,
STA
cked and
R
econstructed
G
raph
C
onvolutional
N
etworks (STAR-GCN),
to solve these problems. Unlike GC-MC, STAR-GCN directly
learns low-dimensional user and item embeddings as the in-
put to the network in an end-to-end fashion. To improve the
learned embeddings and also generalize the model to predict
arXiv:1905.13129v1 [cs.IR] 27 May 2019