LambdaXGB: Research and Performance Validation of an Algorithm Improving on LambdaMART


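This article examines four key algorithms in Learning to Rank: RankNet, LambdaRank, LambdaMART and XGBoost, and how they can be combined and improved. RankNet is the foundation: an early pairwise ranking model trained by gradient descent on a cross-entropy cost. LambdaRank, the predecessor of LambdaMART, improves on it by changing how the gradients of the cost are handled so that they reflect the ranking metric, which raises accuracy. LambdaMART develops this further by combining the LambdaRank gradients with MART, that is, gradient-boosted regression trees, and performs well in information retrieval and recommendation systems.

To improve generalization and reduce overfitting, the paper proposes adding a regularization term to the LambdaMART loss function and studies two common choices, L1 and L2 regularization. This yields three new variants, LambdaXGB L1, LambdaXGB L2 and LambdaXGB, which build on LambdaMART within the regularized XGBoost framework. XGBoost is known as an efficient and scalable gradient boosting framework whose tree ensembles capture interactions between features.

The experiments use the MQ2008 dataset and compare the proposed LambdaXGB algorithms with RankNet and LambdaMART on Normalized Discounted Cumulative Gain (NDCG). NDCG is a widely used measure of ranking quality: it scores the ranked list up to a cutoff and discounts each document's gain by its position, so mistakes near the top cost more. According to the paper, the results confirm that the new algorithms are effective and that adding regularization to the LambdaMART loss function improves the robustness and predictive performance of the model. The work is relevant to understanding and improving existing Learning to Rank algorithms, particularly for large-scale data with high-dimensional features, and it offers a new strategy for improving ranking models in information retrieval and recommendation systems.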
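The NDCG metric mentioned above can be illustrated with a small sketch. It assumes the common exponential gain `2^rel - 1` and a `log2(rank + 1)` discount; the excerpt does not state which NDCG variant the paper uses, and the function names and labels below are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (gain 2^rel - 1, assumed)."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG of the given order divided by the DCG of the ideal (sorted) order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded labels, listed in the order a model ranked the documents.
ranked_labels = [2, 0, 1, 0]
print(ndcg_at_k(ranked_labels, k=4))
```

A perfectly ordered list scores 1.0, and swapping a relevant document toward the bottom lowers the score more than the same swap among the last positions.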

ICIC Express Letters, Part B: Applications
ICIC International 2017, ISSN 2185-2766
Volume 8, Number 8 (tentative), August 2017, pp. 1–
ICICELB-1703-019

LAMBDAXGB: RESEARCH ON LEARNING TO RANK BASED ON LAMBDAMART

Liyan Xiong, Xiaoxia Chen, Xiaohui Huang, Weichun Huang, Maosheng Zhong and Hui Zeng

School of Information Engineering

School of Software Engineering

East China Jiaotong University

No. 808, Shuanggang East Avenue, Nanchang 330013, P. R. China

{ xly ecjtu; chenxiao970508; hwc1968 }@163.com; hxh016@hotmail.com

zhongmaosheng@sina.com; 331549185@qq.com

Received March 2017; accepted May 2017

Abstract. This paper studies and analyzes RankNet, LambdaRank, LambdaMART and XGBoost. It proposes improving LambdaMART by adding regularization to its loss function. Two commonly used regularizations, L1 and L2, are added to the LambdaMART loss function, giving three algorithms: LambdaXGB L1, LambdaXGB L2 and LambdaXGB. Using the MQ2008 dataset, the paper reports NDCG results in comparison with RankNet and LambdaMART and verifies the effectiveness of these algorithms. The results demonstrate that our approach gives state-of-the-art results on a ranking dataset.

Keywords: RankNet, LambdaMART, LambdaXGB, LambdaXGB L1, LambdaXGB

1. Introduction. As the number of available choices grows, search engines and recommendation systems depend more and more on ranking. Traditional ranking algorithms, however, consider only a single factor. With the exponential growth of the data being processed, multiple factors, each given a different weight, need to be combined for ranking. This is the problem that Learning to Rank [1] addresses. Learning to Rank is a supervised learning method: it learns a ranking model from training data and then uses that model to rank new data.

The pairwise approach transforms document ranking into a binary classification problem. For the documents of the same query, any two documents with different labels form one training sample for the binary classifier. Classifying all the document pairs yields a partial order, from which the final ranking is obtained. Pairwise methods include RankNet, LambdaRank, LambdaMART, Ranking SVM, IR SVM and RankBoost.
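The pair construction described above can be sketched as follows. This is a minimal illustration rather than the paper's code: the function name `make_pairs`, the tuple layout and the sample data are all hypothetical.

```python
from itertools import combinations

def make_pairs(docs):
    """Build pairwise training samples from (query_id, doc_id, label) triples.

    Returns (query_id, preferred_doc, other_doc) triples, one for every two
    documents of the same query whose labels differ.
    """
    by_query = {}
    for qid, doc_id, label in docs:
        by_query.setdefault(qid, []).append((doc_id, label))

    pairs = []
    for qid, items in by_query.items():
        for (doc_a, lab_a), (doc_b, lab_b) in combinations(items, 2):
            if lab_a == lab_b:
                continue  # equal labels express no preference, so no sample
            if lab_a > lab_b:
                pairs.append((qid, doc_a, doc_b))
            else:
                pairs.append((qid, doc_b, doc_a))
    return pairs

# Hypothetical data: two queries with graded relevance labels.
docs = [("q1", "d1", 2), ("q1", "d2", 0), ("q1", "d3", 0),
        ("q2", "d4", 1), ("q2", "d5", 1)]
pairs = make_pairs(docs)
print(pairs)
```

Documents are only paired within a query, and query `q2` contributes nothing here because its two documents share a label.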

This work makes the following contribution: unlike the existing LambdaMART method, we add regularization to the loss function of LambdaMART to build new models, namely LambdaXGB L1, LambdaXGB L2 and LambdaXGB. The experiment demonstrates that our approach gives state-of-the-art results on ranking problems.
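To make the effect of L1 and L2 terms on a boosted-tree loss concrete, here is a minimal sketch of the optimal leaf weight under XGBoost's standard regularized objective. The excerpt does not show the paper's own formulation, so this is an assumption about the general mechanism, not the authors' method. In a LambdaMART-style setting, `grad_sum` and `hess_sum` would be the sums of the lambda gradients and their derivatives over the documents in a leaf. The function names are hypothetical.

```python
def soft_threshold(g, alpha):
    """Shrink g toward zero by alpha; this is how an L1 penalty acts on a leaf."""
    if g > alpha:
        return g - alpha
    if g < -alpha:
        return g + alpha
    return 0.0

def leaf_weight(grad_sum, hess_sum, reg_alpha=0.0, reg_lambda=0.0):
    """Optimal leaf weight for summed gradients G and Hessians H.

    w* = -T_alpha(G) / (H + lambda), assuming H + lambda > 0.
    reg_alpha is the L1 strength and reg_lambda the L2 strength.
    """
    return -soft_threshold(grad_sum, reg_alpha) / (hess_sum + reg_lambda)

# Hypothetical leaf statistics.
G, H = -4.0, 2.0
print(leaf_weight(G, H))                                # no regularization
print(leaf_weight(G, H, reg_alpha=1.0))                 # L1 only
print(leaf_weight(G, H, reg_lambda=2.0))                # L2 only
print(leaf_weight(G, H, reg_alpha=1.0, reg_lambda=2.0)) # both
```

Both penalties pull the leaf weight toward zero: L2 does so by enlarging the denominator, while L1 subtracts a constant from the gradient sum and can set the weight to exactly zero.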

The rest of the paper is structured as follows. Section 2 discusses related work. Section 3 presents the LambdaXGB model. Section 4 reports the experiments and analyzes the results. Finally, Section 5 concludes the paper and discusses directions for future work.

2. Related Work. RankNet is built on an underlying model that maps an input feature vector to a number. For a given query, each document d of the query is fed to the ranking model f(d, w), which outputs a score. The cross entropy cost function penalizes
