Graph Convolutional Neural Networks for Web-Scale Recommender Systems
Rex Ying∗†, Ruining He∗, Kaifeng Chen∗†, Pong Eksombatchai∗, William L. Hamilton†, Jure Leskovec∗†
∗Pinterest, †Stanford University
{rhe,kaifengchen,pong}@pinterest.com, {rexying,wleif,jure}@stanford.edu
ABSTRACT
Recent advancements in deep neural networks for graph-structured
data have led to state-of-the-art performance on recommender
system benchmarks. However, making these methods practical and
scalable to web-scale recommendation tasks with billions of items
and hundreds of millions of users remains a challenge.
Here we describe a large-scale deep recommendation engine
that we developed and deployed at Pinterest. We develop a data-
efficient Graph Convolutional Network (GCN) algorithm PinSage,
which combines efficient random walks and graph convolutions
to generate embeddings of nodes (i.e., items) that incorporate both
graph structure as well as node feature information. Compared to
prior GCN approaches, we develop a novel method based on highly
efficient random walks to structure the convolutions and design a
novel training strategy that relies on harder-and-harder training
examples to improve robustness and convergence of the model.
We deploy PinSage at Pinterest and train it on 7.5 billion exam-
ples on a graph with 3 billion nodes representing pins and boards,
and 18 billion edges. According to offline metrics, user studies and
A/B tests, PinSage generates higher-quality recommendations than
comparable deep learning and graph-based alternatives. To our
knowledge, this is the largest application of deep graph embed-
dings to date and paves the way for a new generation of web-scale
recommender systems based on graph convolutional architectures.
ACM Reference Format:
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In KDD ’18: The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 19–23, 2018, London, United Kingdom. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3219819.3219890
1 INTRODUCTION
Deep learning methods have an increasingly critical role in rec-
ommender system applications, being used to learn useful low-
dimensional embeddings of images, text, and even individual users
[9, 12]. The representations learned using deep models can be used
to complement, or even replace, traditional recommendation algorithms
like collaborative filtering, and these learned representations
have high utility because they can be re-used in various recommendation
tasks. For example, item embeddings learned using a deep model can be
used for item-item recommendation and also to recommend themed
collections (e.g., playlists, or “feed” content).
Recent years have seen significant developments in this space—
especially the development of new deep learning methods that are
capable of learning on graph-structured data, which is fundamen-
tal for recommendation applications (e.g., to exploit user-to-item
interaction graphs as well as social graphs) [6, 19, 21, 24, 29, 30].
Most prominent among these recent advancements is the suc-
cess of deep learning architectures known as Graph Convolutional
Networks (GCNs) [19, 21, 24, 29]. The core idea behind GCNs is
to learn how to iteratively aggregate feature information from lo-
cal graph neighborhoods using neural networks (Figure 1). Here a
single “convolution” operation transforms and aggregates feature
information from a node’s one-hop graph neighborhood, and by
stacking multiple such convolutions information can be propagated
across far reaches of a graph. Unlike purely content-based deep
models (e.g., recurrent neural networks [3]), GCNs leverage both
content information as well as graph structure. GCN-based methods
have set a new standard on countless recommender system bench-
marks (see [19] for a survey). However, these gains on benchmark
tasks have yet to be translated to gains in real-world production
environments.
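To make the neighborhood-aggregation idea concrete, the following toy sketch shows a single convolution for one node using a mean aggregator. This is an illustrative NumPy example rather than the PinSage implementation; the function name convolve, the weight shapes, the mean pooling, and the normalization step are assumptions chosen for exposition.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def convolve(node, features, neighbors, W_self, W_neigh):
    """One graph-convolution step for a single node (illustrative only).

    Averages the feature vectors of the node's one-hop neighbors,
    transforms them with a learned weight matrix, and combines the
    result with the node's own transformed features.
    """
    neigh_feats = features[neighbors[node]]      # (num_neighbors, d_in)
    aggregated = neigh_feats.mean(axis=0)        # mean aggregation over neighbors
    h = relu(features[node] @ W_self + aggregated @ W_neigh)
    return h / (np.linalg.norm(h) + 1e-8)        # normalize the embedding

# Toy graph: 4 nodes with 5-dimensional input features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5))
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}   # adjacency lists
W_self, W_neigh = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))

# Stacking K such convolutions propagates information from nodes K hops away.
print(convolve(0, features, neighbors, W_self, W_neigh).shape)  # (8,)
```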
The main challenge is to scale both the training as well as in-
ference of GCN-based node embeddings to graphs with billions of
nodes and tens of billions of edges. Scaling up GCNs is difficult
because many of the core assumptions underlying their design are
violated when working in a big data environment. For example,
all existing GCN-based recommender systems require operating
on the full graph Laplacian during training—a requirement that
is infeasible when the underlying graph has billions of nodes and
its structure is constantly evolving.
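As a point of reference, the propagation rule of the standard full-graph GCN formulation (following Kipf and Welling; shown here as a representative example, since individual systems differ in details) updates the representations of all nodes simultaneously:

$$H^{(k+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(k)}\,W^{(k)}\right), \qquad \tilde{A} = A + I,$$

where $A$ is the adjacency matrix of the entire graph, $\tilde{D}$ the corresponding degree matrix, $H^{(k)}$ the matrix of all node embeddings at layer $k$, and $W^{(k)}$ a learned weight matrix. Every training step therefore multiplies against a Laplacian-derived normalized matrix whose dimensions equal the total number of nodes, and any change to the graph changes $\tilde{A}$ itself, which is why this formulation does not carry over to billion-node, constantly evolving graphs.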
Present work.
Here we present a highly-scalable GCN framework
that we have developed and deployed in production at Pinterest. Our
framework, a random-walk-based GCN named PinSage, operates
on a massive graph with 3 billion nodes and 18 billion edges—a
graph that is 10,000× larger than typical applications of GCNs.
PinSage leverages several key insights to drastically improve the
scalability of GCNs: