A Two-Step Similarity Ranking Scheme for Image Retrieval
Di Wu
1, 2
, Jun Wu
3, *
, Ming-Yu Lu
1
, Chun-Li Wang
1
1
School of Information Science and Technology, Dalian Maritime University, China
2
Software Technology Institute, Dalian Jiaotong University, China
3
School of Computer and Information Technology, Beijing Jiaotong University, China
wudi@qq.com, wuj@bjtu.edu.cn, lumingyu@dlmu.edu.cn, wangcl@dlmu.edu.cn
Abstract—similarity ranking is one of the keys of a content-
based image retrieval (CBIR) system. Among various methods,
manifold ranking (MR) is popular for its application to
relevance feedback in CBIR. Most existing MR methods only
take the visual features into account in the similarity ranking,
however, which is not accurate enough to reflect the intrinsic
semantic structure of a given image database. In this paper, we
propose a two-step similarity ranking scheme that aims to
preserve both visual and semantic resemblance in the
similarity ranking. Concretely, in the first step it derives an
initial visual-based similarity rank through a self-tuning MR
solution. In particular, the Gaussian kernel used in our scheme
is refined by using a point-wise bandwidth. In the second step,
the rank of each database image is further adjusted to achieve
semantic consistency by mining the query log. An empirical
study shows that using two-step similarity ranking in CBIR is
beneficial, and the proposed scheme is more effective than
some existing MR approaches.
Keywords - image retrieval; relevance feedback; similarity
ranking; manifold ranking
I. INTRODUCTION
Content-Based Image Retrieval (CBIR), as an effective
supplementary means of traditional Web search, has drawn
substantial research attention in recent years [1]. Rather than
using text, in a CBIR system, a search task may be initiated
using a query image that is posed by the user to convey a
certain query concept. Then system ranks the database
images according to their similarities to the query concept,
based on the visual features automatically extracted from
images. However, the low-level visual features are
insufficient to characterize the high-level semantics, i.e. the
so-called semantic gap. Relevant feedback has been shown
as a powerful tool to bridge this gap by exploiting the user’s
interaction with CBIR system [19]. During the interaction,
the user is encouraged to label a few images returned as
either positive or negative in terms of whether they are
relevant to the query concept or not, and the labeled images
are then given to the system as additional query examples so
that the ranking results can be refined. In essence, searching
the query concept through relevance feedback can be
regarded as a similarity learning problem, i.e. the system ties
to learn an appropriate similarity measure through relevance
feedback, in order to understand the user’s information needs.
Some recent studies along this direction [3, 18, 22, 2, 5, 10
*
Corresponding author
and 11] focus on learning similarity measures based on the
online feedback information, which is called short-term
learning, while others [15, 16, 4, 8, 12 and 13] aims to
achieve it by using both online and historical feedback
sessions, named the long-term learning.
In the meantime, a surge of efforts have been made in
theory for the graph-based learning, especially in manifold
ranking (MR) [20, 21]. By taking the intrinsic geometrical
structure into account, MR assigns each data point a relative
ranking score, instead of an absolute pairwise similarity as
traditional ways. The score is treated as a distance metric
defined on the data manifold, which is more meaningful to
capture the similarities among data points. Previous studies
have shown that MR is one of the most promising and
successful approaches for image retrieval with relevance
feedback [3, 9, 11, 13 and 14]. Specifically, in [13], both
short-term and long-term feedback experiences are integrated
into a unified MR framework.
Despite the success, the performance of existing MR
methods is sensitive to the “strange points
1
”. MR is based on
an assumption that the data points lying in a same local
neighborhood share some common property and thus they
may have the same class label. As a result, a MR-like
algorithm tries to propagate the ranking score from each
labeled data point to its (mostly unlabeled) neighbor points.
However, due to the semantic gap, this assumption cannot
hold in the CBIR context. As illustrated by Figure 1, given
the query shown in 1a, and two groups of images shown in
1b and 1c, we assume that the central points in group-1 and
group-2 are labeled as positive and negative respectively.
Most neighbor points in group-1 are relevant to the query
except the strange point is irrelevant, while most neighbor
points in group-2 are irrelevant to the query except the
strange point is relevant. Based on the operation principle of
MR, the strange point in group-1 (query irrelevant) is
wrongly ranked in front of the strange point in group-2
(query relevant). We refer to this scenario as “dislocation”
problem.
In this paper, we propose a Two-Step Similarity Ranking
scheme, TS2R for short, in order to address the challenge
mentioned above. Compared with the short-term MR
methods, such as [3], both online and historical feedback
experiences are exploited in TS2R to elevate the retrieval
effectiveness. In contrast to the long-term MR methods, such
1
In this paper, a “strange” point refers to the data point with
different semantic comparing with its neighbor points. For
example, the points labeled as “o” in Figure 1b and 1c.
2014 Sixth International Symposium on Parallel Architectures, Algorithms and Programming
2168-3034/14 $31.00 © 2014 IEEE
DOI 10.1109/PAAP.2014.26
191