Ranking Measures and Loss Functions
in Learning to Rank
Wei Chen∗
Chinese Academy of Sciences
chenwei@amss.ac.cn
Tie-Yan Liu
Microsoft Research Asia
tyliu@microsoft.com
Yanyan Lan
Chinese Academy of Sciences
lanyanyan@amss.ac.cn
Zhiming Ma
Chinese Academy of Sciences
mazm@amt.ac.cn
Hang Li
Microsoft Research Asia
hangli@microsoft.com
Abstract
Learning to rank has become an important research topic in machine learning.
While most learning-to-rank methods learn the ranking functions by minimizing
loss functions, it is the ranking measures (such as NDCG and MAP) that are used
to evaluate the performance of the learned ranking functions. In this work, we
reveal the relationship between ranking measures and loss functions in learning-
to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE. We
show that the loss functions of these methods are upper bounds of the measure-
based ranking errors. As a result, the minimization of these loss functions will lead
to the maximization of the ranking measures. The key to obtaining this result is to
model ranking as a sequence of classification tasks, and define a so-called essen-
tial loss for ranking as the weighted sum of the classification errors of individual
tasks in the sequence. We have proved that the essential loss is both an upper
bound of the measure-based ranking errors, and a lower bound of the loss func-
tions in the aforementioned methods. Our proof technique also suggests a way to
modify existing loss functions to make them tighter bounds of the measure-based
ranking errors. Experimental results on benchmark datasets show that the modifi-
cations lead to better ranking performance, validating our theoretical analysis.
1 Introduction
Learning to rank has become an important research topic in many fields, such as machine learning
and information retrieval. The process of learning to rank is as follows. In training, a number of
sets are given, each set consisting of objects and labels representing their rankings (e.g., in terms of
multi-level ratings¹). Then a ranking function is constructed by minimizing a certain loss function
on the training data. In testing, given a new set of objects, the ranking function is applied to produce
a ranked list of the objects.
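The training/testing workflow above can be sketched as follows. This is only an illustrative toy example, not the paper's method or any specific algorithm it analyzes: it fits a linear scoring function with a simple pointwise squared loss (by stochastic gradient descent) on sets of labeled objects, then sorts a new set by the learned scores. All function names and data here are hypothetical.

```python
# Illustrative learning-to-rank workflow (hypothetical, pointwise):
# each training set holds (feature vector, relevance label) pairs;
# we learn w so that score(x) = w . x approximates the label,
# then rank a new set of objects by descending score.

def train_linear_scorer(training_sets, lr=0.01, epochs=200):
    """Fit w by SGD on the squared loss (score(x) - label)^2."""
    dim = len(training_sets[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for objects in training_sets:
            for x, label in objects:
                err = sum(wi * xi for wi, xi in zip(w, x)) - label
                for i in range(dim):
                    w[i] -= lr * err * x[i]  # gradient step per object
    return w

def rank(w, objects):
    """Return the objects sorted by descending learned score."""
    return sorted(objects, key=lambda x: -sum(wi * xi for wi, xi in zip(w, x)))

# Toy data: two "queries", each a set of (features, relevance-label) pairs.
train = [
    [([1.0, 0.0], 2), ([0.0, 1.0], 0), ([0.5, 0.5], 1)],
    [([0.9, 0.1], 2), ([0.1, 0.9], 0)],
]
w = train_linear_scorer(train)
ranked = rank(w, [[0.2, 0.8], [0.8, 0.2], [0.5, 0.5]])
```

A pointwise squared loss is used here purely for brevity; the paper's analysis concerns pairwise and listwise losses (Ranking SVM, RankBoost, RankNet, ListMLE) and their relationship to measure-based ranking errors.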
Many learning-to-rank methods have been proposed in the literature, with different motivations and
formulations. In general, these methods can be divided into three categories [3]. The pointwise
approach, such as subset regression [5] and McRank [10], views each single object as the learn-
ing instance. The pairwise approach, such as Ranking SVM [7], RankBoost [6], and RankNet [2],
regards a pair of objects as the learning instance. The listwise approach, such as ListNet [3] and
∗ The work was performed when the first and the third authors were interns at Microsoft Research Asia.
¹ In information retrieval, such a label represents the relevance of a document to the given query.