推荐系统中的用户与物品嵌入正则化

需积分: 10 148 浏览量更新于2024-09-07 收藏 1.74MB PDF 举报

“User and Item Embeddings for Recommendation.pdf” 这篇论文“Regularizing Matrix Factorization with User and Item Embeddings for Recommendation”由Thanh Tran、Kyumin Lee（来自美国伍斯特理工学院）、Yiming Liao和Dongwon Lee（来自美国宾夕法尼亚州立大学）共同撰写。论文探讨了在推荐系统中利用用户和物品嵌入（User and Item Embeddings）对矩阵分解进行正则化的创新方法，旨在提高推荐的准确性和效果。在推荐系统领域，传统的矩阵分解方法如协同过滤，通过寻找用户-物品交互矩阵的低秩近似来发现隐藏的用户兴趣和物品属性。然而，这种方法往往忽视了用户和物品的复杂关联以及潜在的语义信息。作者提出了一种新的Regularized Multi-Embedding (RME)模型，该模型通过分解同时融合以下四个关键概念： 1. 用户喜欢的物品：模型试图捕捉每个用户对不同物品的偏好，构建个性化的用户嵌入。 2. 共享相似兴趣的用户：识别哪些用户对相同物品有相似的喜好，这有助于发现用户之间的社区结构。 3. 常被一起喜欢的物品：揭示物品间的相关性，比如电影类型或音乐流派，帮助推荐相似类型的物品。 4. 常被一起不喜欢的物品：理解用户可能同时避开的物品组合，这有助于避免推荐不受欢迎的搭配。实验结果表明，RME模型在显式和隐式反馈数据集上优于现有的最佳推荐模型。它显著提升了Recall@5（召回率@5）、NDCG@20（归一化累积增益@20）和MAP@10（平均精度@10）等关键指标，分别提高了5.9%到7.0%、4.3%到5.6%和7.9%到8.9%。尤其在冷启动场景下，对于交互记录最少的用户，RME模型在MovieLens-10M和MovieLens-20M数据集上相比于其他模型分别提升了NDCG@5的20.2%和29.4%，显示了其在处理新用户或稀疏数据时的优越性能。这一研究贡献在于将深度学习中的词嵌入思想引入推荐系统，并通过正则化策略优化矩阵分解，从而提升推荐的准确性和全面性。这种方法不仅增强了模型的表达能力，还能有效处理用户和物品的稀疏交互数据，有助于解决实际推荐系统中的挑战。

Regularizing Matrix Factorization with User and Item

Embeddings for Recommendation

Thanh Tran, Kyumin Lee

Worcester Polytechnic Institute, USA

{tdtran,kmlee}@wpi.edu

Yiming Liao, Dongwon Lee

Penn State University, USA

{yiming,dongwon}@psu.edu

ABSTRACT

Following recent successes in exploiting both latent factor and word

embedding models in recommendation, we propose a novel Regu-

larized Multi-Embedding (RME) based recommendation model that

simultaneously encapsulates the following ideas via decomposition:

(1) which items a user likes, (2) which two users co-like the same

items, (3) which two items users often co-liked, and (4) which two

items users often co-disliked. In experimental validation, the RME

outperforms competing state-of-the-art models in both explicit and

implicit feedback datasets, signicantly improving Recall@5 by

5.9

∼

7.0%, NDCG@20 by 4.3

∼

5.6%, and MAP@10 by 7.9

∼

8.9%. In

addition, under the cold-start scenario for users with the lowest

number of interactions, against the competing models, the RME

outperforms NDCG@5 by 20.2% and 29.4% in MovieLens-10M and

MovieLens-20M datasets, respectively. Our datasets and source

code are available at: https://github.com/thanhdtran/RME.git.

KEYWORDS

Recommendation; item embeddings; user embeddings; negative

sampling; collaborative ltering.

ACM Reference Format:

Thanh Tran, Kyumin Lee and Yiming Liao, Dongwon Lee. 2018. Regularizing

Matrix Factorization with User and Item Embeddings for Recommendation.

In The 27th ACM International Conference on Information and Knowledge

Management (CIKM ’18), October 22–26, 2018, Torino, Italy. ACM, New York,

NY, USA, 10 pages. https://doi.org/10.1145/3269206.3271730

1 INTRODUCTION

Among popular Collaborative Filtering (CF) methods in recommen-

dation [

], in recent years, latent factor models (LFM)

using matrix factorization have been widely used. LFM are known to

yield relatively high prediction accuracy, are language independent,

and allow additional side information to be easily incorporated and

decomposed together [

]. However, most of conventional LFM

only exploited positive feedback while neglected negative feedback

and treated them as missing data [8, 14, 27, 34].

In movie recommender systems, it was observed that many users

who enjoyed watching Thor: The Dark World, also enjoyed Thor:

Ragnarok. In this case, Thor: The Dark World and Thor: Ragnarok

Permission to make digital or hard copies of all or part of this work for personal or

classroom use is granted without fee provided that copies are not made or distributed

for prot or commercial advantage and that copies bear this notice and the full citation

on the rst page. Copyrights for components of this work owned by others than ACM

must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,

to post on servers or to redistribute to lists, requires prior specic permission and/or a

fee. Request permissions from permissions@acm.org.

CIKM ’18, October 22–26, 2018, Torino, Italy

ACM ISBN 978-1-4503-6014-2/18/10.. . $15.00

https://doi.org/10.1145/3269206.3271730

Users

Liked)items

items

Co-liked)it ems

Co-liked+item+ co-occurrence+ matrix

User+co-occurrence+ matrix

items

Co-disliked) items

Co-dis lik ed+ item+ co-occurrence+ m atrix

p2 p3

Figure 1: An overview of our RME Model, which jointly

decomposes user-item interaction matrix, co-liked item co-

occurrence matrix, co-disliked item co-occurrence matrix,

and user co-occurrence matrix. (V : liked, X : disliked, and

unknown)

can be seen as a pair of co-liked movies. So, if a user preferred Thor:

The Dark World but never watch Thor: Ragnarok, the system can

precisely recommend Thor: Ragnarok to her (

rst observation

Similarly, if two users A and B liked the same movies, we can

assume A and B have the same movie interests. If user A likes a

movie that B has never watched, the system can recommend the

movie to B (

second observation

). In the same manner, we ask if co-

occurred disliked movies can provide any meaningful information.

We observed that most users, who rated Pledge This! poorly (0.8/5.0

on average), also gave a low rating to Run for Your Wife (1.3/5.0 on

average). If the disliked co-occurrence pattern was exploited, Run

for Your Wife would not be recommended to other users who did not

enjoy Pledge This! (

third observation

). This will help reduce the

false positive rate for recommender systems. The same phenomena

would have also occurred in other recommendation domains.

The rst two observations are similar to the basic assumptions of

item CF and user CF where similar scores between items/users are

used to infer the next recommended items for users. Unfortunately,

only the rst two observations have been exploited in conventional

CF. While treating the negative-feedback items dierently from

missing data led to better results [

], to the best of our knowledge,

no previous works exploited the

third observation

to enhance

the recommender systems’ performance.

Therefore, in this paper, we attempt to exploit all three observa-

tions in one model to achieve better recommendation results. With

the recent success of word embedding techniques in natural lan-

guage processing, if we consider pairs of co-occurred liked/disliked

items or pairs of co-occurred users as pairs of co-occurred words,

we can apply word embedding to learn latent representations of

items (e.g., item embeddings) and users (e.g. user embeddings).

Based on this, we propose a Regularized Multi-Embedding based

Session 4E: Recommendation 1

CIKM’18, October 22-26, 2018, Torino, Italy

687

下载后可阅读完整内容，剩余9页未读，立即下载

Jayxp

粉丝: 6
资源: 137

推荐系统中的用户与物品嵌入正则化

PyPI 官网下载 | text-embeddings-0.1.1.tar.gz

X-VECTORS ROBUST DNN EMBEDDINGS FOR SPEAKER RECOGNITION 中文.pdf

Learning Conceptual-Contextual Embeddings for Medical Text.pdf

Dynamic Meta-Embeddings for Improved Sentence Representations.pdf

Context-Aware Embeddings for Automatic Art Analysis.pdf

Python库 | image_embeddings-1.3.1.tar.gz

PyPI 官网下载 | text-embeddings-0.0.5.tar.gz

word-embeddings-subset.p

Python库 | cf_text_embeddings-0.1.1.tar.gz

Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning.pdf

最新资源