Multi-View Deep Learning for Cross-Domain User Modeling: An Innovative Solution for Recommendation Systems

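This article discusses "A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems" by Ali Elkahky et al. In today's online services, personalized recommendation is essential for improving user experience and content relevance. To cope with a large influx of new users while keeping the system scalable, the researchers sought a strategy that preserves recommendation quality and still scales to a large user base. The authors propose a content-based recommendation system whose core is a deep learning model that processes user behavior data, such as browsing history and search queries, and maps users and items into a latent low-dimensional space. In this space, the similarity between a user and the items that user prefers is maximized, which yields more accurate personalized recommendations. The approach highlights the strengths of deep learning for representing user features and extracting item features. To make full use of item features from different domains, the researchers further introduce a multi-view deep learning model, which jointly learns item features from different domains together with the features of individual users, improving the coverage and accuracy of recommendations. Even with very large volumes of data, effective dimensionality reduction lets the system stay efficient and handle complex data structures. In summary, the paper develops a deep learning framework that integrates user and item features across domains and uses multi-view learning to improve the adaptability and accuracy of recommendation systems. The approach matters for real-time, personalized recommendation in online services and offers a novel, practical solution to recommendation over large user bases and multi-source data. It also serves as a technical reference for future recommendation-system research.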
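
To make the shared latent space concrete, the following minimal Python sketch projects a user feature vector and item feature vectors from two domains into one space and scores each pair by cosine similarity. The layer sizes, the single tanh layer per view, the random untrained weights, and the names `user_net`, `item_nets`, and `score` are assumptions for illustration only; the summary above does not specify the network architecture or the training objective.

```python
import math
import random


def project(weights, x):
    """Apply one tanh layer that maps a feature vector into the latent space."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(v * v for v in a))
    norm_b = math.sqrt(sum(v * v for v in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def random_layer(rng, out_dim, in_dim):
    """Hypothetical untrained weights; a real system would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


rng = random.Random(0)
LATENT_DIM = 4

# One shared user view and one item view per domain (sizes are made up).
user_net = random_layer(rng, LATENT_DIM, 6)
item_nets = {
    "news": random_layer(rng, LATENT_DIM, 5),
    "apps": random_layer(rng, LATENT_DIM, 3),
}


def score(user_features, domain, item_features):
    """Similarity between a user and an item from the given domain."""
    u = project(user_net, user_features)
    v = project(item_nets[domain], item_features)
    return cosine(u, v)


user = [1, 0, 1, 0, 0, 1]
print(score(user, "news", [0, 1, 0, 0, 1]))
print(score(user, "apps", [1, 0, 1]))
```

Because every domain projects into the same space as the single user view, a user vector learned from one domain's interactions can in principle be compared with items from another domain, which is the property the summary attributes to the multi-view model.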

based collaborative filtering finds a common space for items and users based on the user-item matrix and combines the item and user representations to find a recommendation. All matrix factorization approaches like [19] and [21] are examples of this technique. CF can be extended to large-scale setups like in [6]. However, CF is generally unable to handle new users and new items, a problem which is often referred to as the cold-start issue.
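
As a rough illustration of the matrix factorization idea and of why it struggles with cold start, the sketch below fits user and item vectors to a toy rating list with plain stochastic gradient descent. It is not the method of [19], [21], or [6]; the learning rate, regularization weight, toy ratings, and function names are all assumptions chosen for the example.

```python
import random


def init_factors(n_rows, dim, rng):
    """Small random latent vectors, one per user or item."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_rows)]


def predict(P, Q, u, i):
    """Predicted rating is the dot product of user and item vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))


def mse(P, Q, ratings):
    """Mean squared reconstruction error over the observed ratings."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


def factorize(ratings, n_users, n_items, dim=2, lr=0.02, reg=0.01,
              epochs=1000, seed=0):
    """Fit user factors P and item factors Q with SGD on observed ratings."""
    rng = random.Random(seed)
    P = init_factors(n_users, dim, rng)
    Q = init_factors(n_items, dim, rng)
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(dim):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q


# Toy data: (user, item, rating). User 3 has no ratings (a cold-start user).
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]

P0, Q0 = factorize(ratings, n_users=4, n_items=3, epochs=0)  # untrained
P, Q = factorize(ratings, n_users=4, n_items=3)              # trained

print("error before:", mse(P0, Q0, ratings))
print("error after: ", mse(P, Q, ratings))
print("cold-start user vector unchanged:", P[3] == P0[3])
```

User 3 never appears in `ratings`, so its vector is never updated and stays at its random initial value. Any prediction for that user is therefore uninformed, which is the cold-start limitation described above.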

The second approach for recommendation systems is content-based recommendation. This approach extracts features from the item’s and/or user’s profile and recommends items to users according to these features. The underlying assumption is that similar users tend to like items similar to the items they liked previously. In [14], a method is proposed to construct a search query with some features of items the user liked before to find other relevant items to recommend. Another example is presented in [15], where each user is modeled by a distribution over News topics that is constructed from articles she liked, with a prior distribution of topic preference computed using all users who share the same location. This approach can handle new items (News articles), but for new users the system used the location feature only, which implies that new users are expected to see the most frequent topics in their location. This might be a good feature for recommending News, but in other domains, for example Apps recommendation, using only location information may not work as a good prior over the user’s preferences.
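
The location-prior idea can be sketched as follows. This is an assumed, simplified smoothing scheme and not necessarily the model used in [15]: the user's topic counts are blended with the location-wide topic distribution, and the hypothetical `strength` parameter controls how many clicks it takes for the user's own history to dominate.

```python
from collections import Counter


def location_prior(clicks_by_user, topics):
    """Topic distribution pooled over all users who share a location."""
    total = Counter()
    for clicks in clicks_by_user:
        total.update(clicks)
    n = sum(total[t] for t in topics)
    if n == 0:
        return {t: 1.0 / len(topics) for t in topics}
    return {t: total[t] / n for t in topics}


def user_topic_distribution(user_clicks, prior, strength=5.0):
    """Blend a user's topic counts with the location prior.

    With no clicks the result equals the prior; with many clicks the
    user's own history dominates.
    """
    counts = Counter(user_clicks)
    n = sum(counts[t] for t in prior)
    return {t: (counts[t] + strength * prior[t]) / (n + strength) for t in prior}


topics = ["politics", "sports", "tech"]
same_location_users = [["sports", "sports", "tech"], ["sports", "politics"]]
prior = location_prior(same_location_users, topics)

new_user = user_topic_distribution([], prior)
active_user = user_topic_distribution(["tech"] * 20, prior)

print("prior:      ", prior)
print("new user:   ", new_user)
print("active user:", active_user)
```

A new user receives exactly the location prior, so the recommendation falls back to the most frequent local topics, which matches the behavior described above for users without history.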

Recently, researchers have developed approaches that combine both collaborative recommendation and content-based recommendation. In [16], the author used item features to smooth user data before using collaborative filtering. In [7], the authors used a Restricted Boltzmann Machine to learn similarity between items, and then combined this with collaborative filtering. A Bayesian approach was developed in [32] to jointly learn the distribution of items, research papers in their case, over different components (topics) and the factorization of the rating matrix.

Handling the cold start issue in recommendation systems is studied mainly for new items (items that have no rating by any user). As we mentioned before, all content-based filtering can handle cold start for items, and there are some methods that were developed and evaluated specifically for this issue, like in [24] and [7]. The work in [18] studied how to learn user preferences for new users incrementally by recommending items that give the most information about user preferences while minimizing the probability of recommending irrelevant content.

User modeling via rich features has been studied a lot recently. For example, it has been shown that user search queries can be used to discover the similarities between users [25]. Rich features from user search history have also been used for personalized web search [26]. For recommendation systems, the authors in [2] leveraged the user’s historical search queries to build a personalized taxonomy for recommending Ads. On the other hand, researchers have discovered that a user’s social behaviors can also be used to build the profile of the user. In [1], the authors used the user’s tweets in Twitter data to recommend News articles.

Most traditional recommendation system research focused on data within a single domain. Recently, there has been an increasing interest in cross domain recommendation. There are different approaches for addressing cross domain recommendation. One approach is to assume that different domains share a similar set of users but not the items, as illustrated in [20]. In their work, the authors augmented data from ratings of movies and books from datasets that have common users. The augmented data set was then used to perform collaborative filtering. They showed that this in particular helped the cases where users have little profile information in one of the domains (cold-start users). The second approach addressed the scenarios where the same set of items shared different types of feedback in different domains, like user clicks or explicit user ratings. As shown in [17], the authors introduced a coordinate system transfer method for cross domain matrix factorization. In [12], the authors studied cross domain recommendation in the case where there existed no shared users or items between domains. They developed a generative model to discover common clusters between different domains. However, a challenge in their approach is its ability to scale beyond medium datasets due to the computational cost. A different approach was introduced in [28] for author collaboration recommendation, where they built a topic model to recommend authors to collaborate from different research fields.

For many approaches in recommendation systems the objective function is to minimize the root mean squared error on the user-item matrix reconstruction. Recently, ranking-based objective functions have been shown to be more effective in giving better recommendations, as shown in [11].
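
The difference between the two kinds of objective can be seen on a two-item toy case. The sketch below compares RMSE with a pairwise logistic loss, which is one common ranking-style objective chosen here as an assumption; it is not claimed to be the objective used in [11], and the ratings and predictions are made up.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between predicted and observed ratings."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def pairwise_logistic_loss(score_preferred, score_other):
    """Small when the preferred item is scored above the other item."""
    return math.log1p(math.exp(-(score_preferred - score_other)))


observed = [5.0, 4.0]  # the user prefers item 0 to item 1
pred_a = [4.4, 4.6]    # ranks item 1 first (wrong order)
pred_b = [5.6, 3.4]    # ranks item 0 first (right order)

print("RMSE a:", rmse(pred_a, observed))
print("RMSE b:", rmse(pred_b, observed))
print("rank loss a:", pairwise_logistic_loss(pred_a[0], pred_a[1]))
print("rank loss b:", pairwise_logistic_loss(pred_b[0], pred_b[1]))
```

Both predictions miss each rating by the same amount, so RMSE cannot tell them apart, while the pairwise loss penalizes the prediction that puts the less preferred item first.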

Deep learning has recently been proposed for building recommendation systems for both collaborative and content-based approaches. In [22], an RBM model was used for collaborative filtering. Deep learning for content-based recommendation has been done, for example, in [30], where deep learning was applied to learn an embedding for music features. This embedding was then used to regularize matrix factorization in collaborative filtering.

3. DESCRIPTION OF THE DATA SETS

This section introduces the data sets. We describe the data collection process and the feature representations for each data set, as well as some basic statistics of the data.

The four data sets used in this study were collected from user logs of several Microsoft products, including (1) search engine logs from the Bing Web vertical, (2) News article browsing history from the Bing News vertical, (3) App download logs from Windows AppStore, and (4) Movie/TV view logs from Xbox. All the logs were collected between December 2013 and June 2014, with primary focus on English-speaking markets including the United States, Canada and Great Britain.

(User Features) We collected users’ search queries and their clicked URLs from Bing to form user features. Queries were first normalized, stemmed and then split into unigram features, and URLs were shortened to domain level only (e.g., www.linkedin.com) to reduce the feature dimension. We then used TF-IDF scores to keep only the most popular and non-trivial features. Overall, we selected 3 million unigram features and 500K domain features, leading to a user feature vector with a total length of 3.5 million.
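
The user feature pipeline described above might look roughly like the following sketch. Several details are assumptions: stemming is omitted because the standard library has no stemmer, the `q:` and `d:` prefixes are invented to keep the two feature types apart, and the selection score (total term frequency times log inverse user frequency) is only one plausible reading of "TF-IDF scores", since the text does not give the exact formula.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit


def query_unigrams(query):
    """Lowercase a query and split it into alphanumeric unigrams."""
    return re.findall(r"[a-z0-9]+", query.lower())


def url_domain(url):
    """Reduce a clicked URL to its host name, e.g. www.linkedin.com."""
    if "//" not in url:
        url = "//" + url  # let urlsplit treat a bare host as the network location
    return urlsplit(url).hostname or ""


def user_features(queries, clicked_urls):
    """Count unigram and domain features for one user."""
    feats = Counter()
    for q in queries:
        feats.update("q:" + tok for tok in query_unigrams(q))
    for u in clicked_urls:
        domain = url_domain(u)
        if domain:
            feats["d:" + domain] += 1
    return feats


def select_features(users, top_k):
    """Keep the top_k features by (total count) x log(n_users / users_with_feature)."""
    n_users = len(users)
    df, tf = Counter(), Counter()
    for feats in users:
        df.update(feats.keys())
        tf.update(feats)
    scores = {f: tf[f] * math.log(n_users / df[f]) for f in tf}
    return sorted(scores, key=lambda f: (-scores[f], f))[:top_k]


users = [
    user_features(["the python tutorial", "python list"], ["https://www.python.org/doc"]),
    user_features(["the weather"], ["https://www.linkedin.com/in/x"]),
    user_features(["the python docs"], []),
]
print(select_features(users, 3))
```

In this toy data the unigram `the` appears for every user, so its inverse-frequency factor is zero and it ranks last, which illustrates how such a score can drop trivial features while keeping frequent but discriminative ones.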

(News Features) We collected news article clicks from the Bing News vertical. Each News item is represented by three parts of features. The first part is the title features, encoded using the letter tri-gram representation that we will describe in the next section. Secondly, the top-level category of each News item (e.g., Entertainment) is encoded as binary features. Finally, the Named Entities in each article, extracted using
