A Novel Visual-Region-Descriptor-based Approach to
Sketch-based Image Retrieval
Cheng Jin, Zheming Wang, Tianhao Zhang, Qinen Zhu, Yuejie Zhang
School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing
Fudan University, Shanghai 200433, China
{jc, 13210240035, 14210240058, 13210240131, yjzhang}@fudan.edu.cn
ABSTRACT
A novel Visual-Region-Descriptor-based approach is developed
in this paper to facilitate more effective Sketch-based Image
Retrieval (SBIR), which can be treated as a problem of bilateral
visual mapping and modeled as an inter-related correlation
distribution over visual semantic representations of sketches and
images. To cross the matching barrier between binary query
sketches and full color natural images, we focus on constructing a
visual pre-analysis via sketch-like representation transformation to
improve the overall sketch-image resemblance, creating a dedicated
visual region descriptor to obtain better visual feature generation
for sketches and images, and designing a dynamic sketch-image
matching scheme to achieve a more precise characterization of the
correlations between them. Such a visual-region-descriptor-based
SBIR pattern not only enables users to present whatever they
imagine on the sketch query panel, but also returns the images
most similar to the picture in the user's mind. Very positive results
were obtained in our experiments on large public datasets.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search
and Retrieval – Search Process.
Keywords
Sketch-based image retrieval; visual region descriptor; sketch-like
representation; feature generation; matching; dynamic gridding
1. INTRODUCTION
With the explosive growth of images available both online and
offline, how to achieve more effective image retrieval has become
an important research focus [1, 2]. However, an image user's
query intention is often complex and cannot easily be formulated
as an exact keyword query or an example-image query that meets
his/her real needs. In specific scenarios, searching for images by
text or example-image queries returns frustrating and dubious
results [3, 4]. Sketch-based Image Retrieval (SBIR) thus emerged
as a natural solution, more accurate and convenient than the
traditional text-based or content-based methods, which enables
users to flexibly express what they want with hand-drawn strokes
of the pictures they imagine.
Although SBIR has been extensively studied in recent years,
optimal solutions are still lacking, and three inter-related issues
should be addressed simultaneously: 1) in-depth
visual analysis to improve the resemblance between hand-drawn
sketches and natural images; 2) intermediate descriptor generation
to bridge the representational gap between sketches and images;
and 3) pairwise sketch-image matching optimization to identify
better correspondences between sketches and images. To address
the first issue, it is important to establish a robust visual pre-
analysis mechanism that can reasonably transform a natural image
into a sketch-like representation. To address the second issue, it is
critical to explore an optimal descriptor that can achieve a more
precise and comprehensive visual feature representation for both
sketches and images. To address the third issue, it is essential to
develop an appropriate matching scheme with high accuracy but
low cost, which can efficiently exploit the correlations among
visual attributes of sketches and images.
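As a rough illustration of such a pre-analysis step (a minimal sketch under stated assumptions, not the exact pipeline described later in this paper), contour detection can be followed by a screening pass that keeps only the strongest edge pixels. In the snippet below, OpenCV's Canny detector is used purely as a stand-in contour detector, and keep_ratio is a hypothetical screening parameter.

import cv2
import numpy as np

def sketch_like_representation(image_path, keep_ratio=0.3):
    # Transform a natural image into a binary, sketch-like edge map.
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)   # suppress texture noise
    edges = cv2.Canny(gray, 50, 150)           # raw contour response
    # Edge-pixel screening: keep only the strongest responses so that the
    # retained points resemble the strokes a human would draw.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    strength = np.sqrt(gx ** 2 + gy ** 2)
    strength[edges == 0] = 0
    if strength.max() > 0:
        cutoff = np.quantile(strength[strength > 0], 1.0 - keep_ratio)
        edges[strength < cutoff] = 0
    return edges

The resulting binary edge map plays the role of the sketch-like representation that query sketches are later matched against.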
Based on these observations, a novel Visual-Region-Descriptor-
based approach is developed in this paper to facilitate more
effective SBIR. Our scheme differs significantly from earlier work
in the following aspects: a) The visual pre-analysis via the sketch-like
representation transformation for images is constructed based on
the globalPb contour detector and edge pixel screening, in which
the edge points that humans find important in natural images
are preserved to greatly improve the general sketch-image
resemblance. b) The visual region descriptor is created for
encoding both the local and global features in sketches and
images to obtain better visual feature generation for both, in
which an adaptive weighting quantization is especially proposed
to quantize local features into discrete types, and the global spatial
location is encoded from a series of sub-regions based on a novel
dynamic grid algorithm. c) A dynamic sketch-image matching
scheme is designed to achieve more precise characterization of
the correlations between binary query sketches and full color
images, in which each sketch or image is partitioned into a series
of non-equal-sized rectangular regions and a global matching
kernel is constructed as a weighted sum of the separate region
kernels, as illustrated at the end of this section. d) A new real-time SBIR framework is
built by fusing the above pre-analysis, representation and
matching patterns, which not only enables users to present on the
sketch query panel whatever they imagine, but also returns the
images most similar to the picture in the user's mind. Such a
visual-region-descriptor-based SBIR pattern can be modeled as an
inter-related correlation distribution over visual semantic
representations of sketches and images, in which the key is to
create more effective visual feature associations and to measure to
what degree they are related. Our experiments on large public
datasets have yielded very positive results.
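To make the region-based matching idea concrete, the snippet below gives a minimal sketch under assumptions of our own: each sketch and image is represented by per-region feature histograms (one per dynamic-grid region), the region kernel is taken to be histogram intersection, and the region weights are supplied by the caller. None of these specific choices are prescribed here; they only illustrate the weighted-sum structure of the global matching kernel.

import numpy as np

def region_kernel(h_sketch, h_image):
    # Histogram-intersection similarity between two region descriptors.
    return np.minimum(h_sketch, h_image).sum()

def global_matching_kernel(sketch_regions, image_regions, weights):
    # Global matching score as a weighted sum of per-region kernels over
    # corresponding (non-equal-sized) grid regions.
    assert len(sketch_regions) == len(image_regions) == len(weights)
    return sum(w * region_kernel(s, i)
               for w, s, i in zip(weights, sketch_regions, image_regions))

At query time, database images would then be ranked by this score against the region descriptors of the query sketch.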
2. RELATED WORK
SBIR first received attention around 1990 and arguably began to
gain momentum in the mid-nineties, but little progress has been
made in the past decade [5, 6]. Earlier research was typically driven
by queries comprising blobs of color or predefined texture, later
augmented with shape and spectral descriptors [7, 8]. To achieve
an interactive response, it is impossible to compare the