Graph Convolutional Neural Networks for Web-Scale Recommender Systems
Rex Ying∗†, Ruining He∗, Kaifeng Chen∗†, Pong Eksombatchai∗, William L. Hamilton†, Jure Leskovec∗†
∗Pinterest, †Stanford University
{rhe,kaifengchen,pong}@pinterest.com, {rexying,wleif,jure}@stanford.edu
ABSTRACT
Recent advancements in deep neural networks for graph-structured
data have led to state-of-the-art performance on recommender
system benchmarks. However, making these methods practical and
scalable to web-scale recommendation tasks with billions of items
and hundreds of millions of users remains a challenge.
Here we describe a large-scale deep recommendation engine
that we developed and deployed at Pinterest. We develop a data-
efficient Graph Convolutional Network (GCN) algorithm PinSage,
which combines efficient random walks and graph convolutions
to generate embeddings of nodes (i.e., items) that incorporate both
graph structure as well as node feature information. Compared to
prior GCN approaches, we develop a novel method based on highly
efficient random walks to structure the convolutions and design a
novel training strategy that relies on harder-and-harder training
examples to improve robustness and convergence of the model.
We deploy PinSage at Pinterest and train it on 7.5 billion exam-
ples on a graph with 3 billion nodes representing pins and boards,
and 18 billion edges. According to offline metrics, user studies and
A/B tests, PinSage generates higher-quality recommendations than
comparable deep learning and graph-based alternatives. To our
knowledge, this is the largest application of deep graph embed-
dings to date and paves the way for a new generation of web-scale
recommender systems based on graph convolutional architectures.
ACM Reference Format:
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In KDD ’18: The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 19–23, 2018, London, United Kingdom. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3219819.3219890
1 INTRODUCTION
Deep learning methods have an increasingly critical role in rec-
ommender system applications, being used to learn useful low-
dimensional embeddings of images, text, and even individual users
[9, 12]. The representations learned using deep models can be used
to complement, or even replace, traditional recommendation algorithms
like collaborative filtering, and these learned representations
have high utility because they can be re-used in various recommendation
tasks. For example, item embeddings learned using a deep model can be
used for item-item recommendation and also to recommend themed
collections (e.g., playlists, or “feed” content).
Recent years have seen significant developments in this space—
especially the development of new deep learning methods that are
capable of learning on graph-structured data, which is fundamen-
tal for recommendation applications (e.g., to exploit user-to-item
interaction graphs as well as social graphs) [6, 19, 21, 24, 29, 30].
Most prominent among these recent advancements is the suc-
cess of deep learning architectures known as Graph Convolutional
Networks (GCNs) [19, 21, 24, 29]. The core idea behind GCNs is
to learn how to iteratively aggregate feature information from lo-
cal graph neighborhoods using neural networks (Figure 1). Here a
single “convolution” operation transforms and aggregates feature
information from a node’s one-hop graph neighborhood, and by
stacking multiple such convolutions information can be propagated
across far reaches of a graph. Unlike purely content-based deep
models (e.g., recurrent neural networks [3]), GCNs leverage both
content information as well as graph structure. GCN-based methods
have set a new standard on countless recommender system bench-
marks (see [19] for a survey). However, these gains on benchmark
tasks have yet to be translated to gains in real-world production
environments.
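To make the neighborhood-aggregation idea concrete, the following toy sketch shows a single convolution for one node using a mean aggregator. This is an illustrative NumPy example rather than the PinSage implementation; the function name convolve, the weight shapes, the mean pooling, and the normalization step are assumptions chosen for exposition.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def convolve(node, features, neighbors, W_self, W_neigh):
    """One graph-convolution step for a single node (illustrative only).

    Averages the feature vectors of the node's one-hop neighbors,
    transforms them with a learned weight matrix, and combines the
    result with the node's own transformed features.
    """
    neigh_feats = features[neighbors[node]]      # (num_neighbors, d_in)
    aggregated = neigh_feats.mean(axis=0)        # mean aggregation over neighbors
    h = relu(features[node] @ W_self + aggregated @ W_neigh)
    return h / (np.linalg.norm(h) + 1e-8)        # normalize the embedding

# Toy graph: 4 nodes with 5-dimensional input features.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5))
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}   # adjacency lists
W_self, W_neigh = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))

# Stacking K such convolutions propagates information from nodes K hops away.
print(convolve(0, features, neighbors, W_self, W_neigh).shape)  # (8,)
```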
The main challenge is to scale both the training as well as in-
ference of GCN-based node embeddings to graphs with billions of
nodes and tens of billions of edges. Scaling up GCNs is difficult
because many of the core assumptions underlying their design are
violated when working in a big data environment. For example,
all existing GCN-based recommender systems require operating
on the full graph Laplacian during training—a requirement that
is infeasible when the underlying graph has billions of nodes and
its structure is constantly evolving.
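As a point of reference, the propagation rule of the standard full-graph GCN formulation (following Kipf and Welling; shown here as a representative example, since individual systems differ in details) updates the representations of all nodes simultaneously:

$$H^{(k+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(k)}\,W^{(k)}\right), \qquad \tilde{A} = A + I,$$

where $A$ is the adjacency matrix of the entire graph, $\tilde{D}$ the corresponding degree matrix, $H^{(k)}$ the matrix of all node embeddings at layer $k$, and $W^{(k)}$ a learned weight matrix. Every training step therefore multiplies against a Laplacian-derived normalized matrix whose dimensions equal the total number of nodes, and any change to the graph changes $\tilde{A}$ itself, which is why this formulation does not carry over to billion-node, constantly evolving graphs.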
Present work.
Here we present a highly-scalable GCN framework
that we have developed and deployed in production at Pinterest. Our
framework, a random-walk-based GCN named PinSage, operates
on a massive graph with 3 billion nodes and 18 billion edges—a
graph that is 10,000× larger than typical applications of GCNs.
PinSage leverages several key insights to drastically improve the
scalability of GCNs: