
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
Huifeng Guo∗1, Ruiming Tang2, Yunming Ye†1, Zhenguo Li2, Xiuqiang He2
1 Shenzhen Graduate School, Harbin Institute of Technology, China
2 Noah’s Ark Research Lab, Huawei, China
1 huifengguo@yeah.net, yeyunming@hit.edu.cn
2 {tangruiming, li.zhenguo, hexiuqiang}@huawei.com
∗ This work was done while Huifeng Guo was an intern at Noah’s Ark Research Lab, Huawei.
† Corresponding author.
Abstract
Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expert feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its “wide” and “deep” parts, with no need for feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.
1 Introduction
The prediction of click-through rate (CTR) is critical in recommender systems, where the task is to estimate the probability that a user will click on a recommended item. In many recommender systems the goal is to maximize the number of clicks, so the items returned to a user can be ranked by estimated CTR; in other application scenarios such as online advertising it is also important to improve revenue, so the ranking strategy can be adjusted to CTR × bid across all candidates, where “bid” is the benefit the system receives if the item is clicked by a user. In either case, it is clear that the key is in estimating CTR correctly.
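As a concrete illustration of the two ranking strategies just described, the sketch below ranks a handful of candidates by estimated CTR alone and by expected revenue CTR × bid; the item names, CTR estimates, and bids are hypothetical values chosen for illustration.

```python
# Hypothetical candidates: (item_id, estimated CTR, bid earned per click).
candidates = [
    ("item_a", 0.031, 0.40),
    ("item_b", 0.012, 1.50),
    ("item_c", 0.054, 0.10),
]

# Recommendation scenario: maximize clicks, so rank by estimated CTR.
by_ctr = sorted(candidates, key=lambda c: c[1], reverse=True)

# Advertising scenario: maximize revenue, so rank by expected value CTR * bid.
by_revenue = sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)

print([c[0] for c in by_ctr])      # ['item_c', 'item_a', 'item_b']
print([c[0] for c in by_revenue])  # ['item_b', 'item_a', 'item_c']
```

Either way, the ranking is only as good as the CTR estimates that feed it.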
Figure 1: Wide & deep architecture of DeepFM. The wide and deep components share the same input raw feature vector, which enables DeepFM to learn low- and high-order feature interactions simultaneously from the input raw features.

It is important for CTR prediction to learn implicit feature interactions behind user click behaviors. In our study of a mainstream app market, we found that people often download apps for food delivery at meal-time, suggesting that the (order-2) interaction between app category and time-stamp can be used as a signal for CTR. As a second observation,
male teenagers like shooting games and RPG games, which means that the (order-3) interaction of app category, user gender and age is another signal for CTR. In general, such interactions of features behind user click behaviors can be highly sophisticated, where both low- and high-order feature interactions should play important roles. According to the insights of the Wide & Deep model [Cheng et al., 2016] from Google, considering low- and high-order feature interactions simultaneously brings additional improvement over considering either alone.
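To make the shared-input idea of Figure 1 and this "both orders at once" insight concrete, here is a minimal forward-pass sketch that feeds the same per-field embeddings to an FM-style second-order term and to a small deep network, then sums the two (plus a first-order term) before a sigmoid. The field count, vocabulary size, embedding dimension, and layer sizes are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 3 categorical fields, vocabulary of 100 per field,
# embedding dimension k = 8, one hidden layer of 16 units in the deep component.
num_fields, vocab_size, k = 3, 100, 8
embeddings = rng.normal(scale=0.01, size=(num_fields, vocab_size, k))  # shared latent vectors
linear_w = rng.normal(scale=0.01, size=(num_fields, vocab_size))       # first-order weights
W1 = rng.normal(scale=0.1, size=(num_fields * k, 16))                  # deep hidden layer
w2 = rng.normal(scale=0.1, size=16)                                    # deep output layer

def predict_ctr(field_ids):
    """sigmoid(first-order + FM second-order + deep output), all on the same embeddings."""
    embs = np.stack([embeddings[f, i] for f, i in enumerate(field_ids)])  # (num_fields, k)

    first_order = sum(linear_w[f, i] for f, i in enumerate(field_ids))

    # FM second-order term: 0.5 * sum_d[(sum_i v_{i,d})^2 - sum_i v_{i,d}^2]
    fm_second_order = 0.5 * np.sum(np.sum(embs, axis=0) ** 2 - np.sum(embs ** 2, axis=0))

    # Deep component: the same embeddings, concatenated and passed through a small MLP.
    hidden = np.maximum(0.0, embs.reshape(-1) @ W1)  # ReLU
    deep_out = hidden @ w2

    return 1.0 / (1.0 + np.exp(-(first_order + fm_second_order + deep_out)))

# One hypothetical sample: indices for (app category, user gender, age bucket).
print(predict_ctr([3, 42, 7]))
```

The low-order and high-order parts consume the same raw-feature embeddings rather than separately engineered inputs, which is the point of the shared wide/deep input in Figure 1.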
The key challenge is in effectively modeling feature interactions. Some feature interactions can be easily understood and thus can be designed by experts (like the instances above). However, most other feature interactions are hidden in data and difficult to identify a priori (for instance, the classic association rule “diaper and beer” is mined from data, instead of being discovered by experts), and can only be captured automatically by machine learning. Even for easy-to-understand interactions, it seems unlikely for experts to model them exhaustively, especially when the number of features is large.
Despite their simplicity, generalized linear models, such as FTRL [McMahan et al., 2013], have shown decent performance in practice. However, a linear model lacks the ability to learn feature interactions, and a common practice is to manually include pairwise feature interactions in its feature vector. Such a method is hard to generalize to model high-order feature interactions or those that never or rarely appear in the training data [Rendle, 2010].
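A minimal sketch of that common practice, with hypothetical feature names: the linear model gets one weight per raw feature plus one weight per manually crossed pair, so any combination that never occurs in the training data simply has no weight and contributes nothing at prediction time.

```python
# Hypothetical raw features for one impression.
sample = {"app_category": "food_delivery", "hour": "12"}

# Manually engineered order-2 cross feature: app_category x hour.
cross = f"app_category={sample['app_category']}&hour={sample['hour']}"

# A linear model keeps one weight per (one-hot) feature, including each cross
# feature actually observed during training.
weights = {
    "app_category=food_delivery": 0.8,
    "hour=12": 0.3,
    "app_category=food_delivery&hour=12": 1.2,  # learned only because this pair occurred in training
}

logit = (weights.get(f"app_category={sample['app_category']}", 0.0)
         + weights.get(f"hour={sample['hour']}", 0.0)
         + weights.get(cross, 0.0))

# An unseen pair such as app_category=food_delivery & hour=03 falls back to 0.0:
# the linear model cannot generalize the interaction to combinations it never observed.
print(logit)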
Factorization Machines