978-1-4799-5458-2 ©2014 IEEE 780
2014 2nd International Conference on Systems and Informatics (ICSAI 2014)
Image Retrieval Based on Color-Spatial Histograms
Shan Zeng
College of Mathematics and Computer Science
Wuhan Polytechnic University
Wuhan, Hubei, China
Jun Bai
College of Mathematics and Computer Science
Wuhan Textile University
Wuhan, Hubei, China
Rui Huang
School of Automation
Huazhong University of Science and Technology
Wuhan, Hubei, China
Abstract—A novel image retrieval method using color-spatial
histograms is presented in this paper. The method consists of
three phases. First, we quantize the color space with a Gaussian
mixture model. Second, color-spatial histograms are generated
for both the query image and the images in the database, based
on the quantized color components and an improved spatiogram
method. Finally, using the color-spatial histograms as features,
we compare the query image with the images in the database by
Jensen-Shannon divergence. The effectiveness of the proposed
method is demonstrated by experiments on the COREL image
database.
Keywords-Image retrieval, Gaussian mixture model, Color-spatial
histogram, Jensen-Shannon divergence
I. INTRODUCTION
The color histogram, as a representation of an image (or
region), has been widely applied to a variety of computer
vision problems. However, color histograms still have some
open problems. First, at full color resolution, a
2^24-dimensional vector is needed to represent the histogram of
the RGB color space with axis ranges [255, 255, 255], a
3600000-dimensional vector for the HSV color space with axis
ranges [360, 100, 100], and a 9274800-dimensional vector for
the LUV space with axis ranges [100, 354, 262]. Such
high-dimensional feature representations are too large and
unnecessary for image retrieval. To reduce the number of
histogram bins effectively (for example, the results in [1]
show that the number of color histogram bins for image
retrieval can be reduced to 128), we can first quantize the
color space and then build the histogram from the quantized
color components. Recently, many researchers have focused on
methods of quantizing color spaces for histogram generation.
The uniform quantization method of [2], applied to each color
channel of every pixel, is simple and popular.
K-means [3], competitive learning [4], fuzzy c-means [5] and
self-organizing maps [6] are also typical color quantization
algorithms.
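As an illustration of the quantize-then-count idea, the following minimal sketch builds a normalized histogram from uniformly quantized color channels. The function name, the 8 x 4 x 4 split (chosen only so that the total is 128 bins) and the channel maxima are assumptions made for this example; they are not the settings of [1] or [2].

```python
def uniform_color_histogram(pixels, bins_per_channel=(8, 4, 4),
                            channel_max=(255, 255, 255)):
    """Normalized histogram over uniformly quantized color bins.

    pixels: iterable of (c1, c2, c3) tuples with 0 <= c_k <= channel_max[k].
    The default 8 x 4 x 4 split is an illustrative assumption (128 bins).
    """
    n_bins = bins_per_channel[0] * bins_per_channel[1] * bins_per_channel[2]
    counts = [0] * n_bins
    total = 0
    for pixel in pixels:
        index = 0
        for value, bins, vmax in zip(pixel, bins_per_channel, channel_max):
            # Map the channel value to one of `bins` equal-width intervals.
            level = min(int(value * bins / (vmax + 1)), bins - 1)
            # Combine the three channel levels into a single bin index.
            index = index * bins + level
        counts[index] += 1
        total += 1
    return [c / total for c in counts] if total else counts


histogram = uniform_color_histogram(
    [(0, 0, 0), (255, 255, 255), (0, 0, 0), (128, 0, 0)]
)
```

With this split, the 2^24 possible RGB colors are mapped onto 128 bins, at the cost of bin boundaries that ignore how the colors are actually distributed in the image.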
This work is supported by the National Natural Science Foundation of
China under Grant 61303116.
Another important problem with color histograms is that they
ignore spatial information in the image representation.
Because two spatially different images may have the same
color histogram, as shown in Fig. 1 (a)~(b), histogram
generation for image retrieval must consider not only the
color quantization but also the actual spatial distribution of
colors in a given image. To overcome this limitation of
conventional color histograms, many researchers have
incorporated local spatial information into color quantization
methods to improve histogram performance. Compared
with traditional quantization methods, Gauss mixture vector
quantization (GMVQ), introduced in [1], provides a better way
of exploiting the spatial features of images. However, the
GMVQ algorithm requires the number of color components to
be set in advance. The spatial histogram (spatiogram), which
reflects spatial information, was proposed by Birchfield et al.
[7]. However, as shown in Fig. 1 (c)~(d), if cyan falls into the
same bin as blue, the spatiograms of these two blocks are
identical despite their different color patterns.
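The difference between a plain histogram and a histogram with spatial statistics can be sketched as follows. The code assumes that each pixel has already been assigned a quantized bin label and stores, per bin, the pixel count together with the mean and covariance of the pixel coordinates. The function name and the toy label images are hypothetical and serve only as an illustration.

```python
def spatiogram(labels):
    """Per-bin pixel count, mean location and location covariance.

    labels: 2-D list where labels[y][x] is the quantized color bin of a pixel.
    Returns {bin: (count, (mean_x, mean_y), ((cxx, cxy), (cxy, cyy)))}.
    """
    sums = {}
    for y, row in enumerate(labels):
        for x, b in enumerate(row):
            # Accumulate n, sum x, sum y, sum x^2, sum xy, sum y^2 per bin.
            s = sums.setdefault(b, [0, 0.0, 0.0, 0.0, 0.0, 0.0])
            s[0] += 1
            s[1] += x
            s[2] += y
            s[3] += x * x
            s[4] += x * y
            s[5] += y * y
    result = {}
    for b, (n, sx, sy, sxx, sxy, syy) in sums.items():
        mx, my = sx / n, sy / n
        # Population covariance of the coordinates: E[uv] - E[u]E[v].
        cov = ((sxx / n - mx * mx, sxy / n - mx * my),
               (sxy / n - mx * my, syy / n - my * my))
        result[b] = (n, (mx, my), cov)
    return result


# Two toy label images with identical bin counts but different layouts.
left_right = spatiogram([[0, 0, 1, 1], [0, 0, 1, 1]])
top_bottom = spatiogram([[0, 0, 0, 0], [1, 1, 1, 1]])
```

Both toy images contain four pixels in each bin, so their plain histograms coincide, while the mean locations stored for each bin differ. If two visually different colors are merged into one bin, however, these statistics can no longer separate them, which is the limitation noted for Fig. 1 (c)~(d).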
To retain the advantages and overcome the drawbacks
of the GMVQ algorithm and the spatiogram method, we present a
new color quantization method and a new histogram
generation method for incorporating spatial information into
the retrieval framework. The new color quantization method,
named EM-BIC, applies the Expectation-Maximization (EM)
algorithm with the Bayesian Information Criterion (BIC) [8] to
a Gaussian mixture model (GMM), and can determine the number
of quantized color components automatically. We apply the
EM-BIC algorithm directly in the HSV color space. Unlike the
spatiogram proposed by Birchfield et al. [7], our color-spatial
histogram contains, for each color component quantized by the
EM-BIC algorithm, the number of pixels, the mean vector of the
pixel locations, and the covariance matrix of the pixel
locations.
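The idea of selecting the number of mixture components with BIC can be sketched on one-dimensional data. This is a simplified illustration, not the EM-BIC algorithm of this paper: it fits scalar values (for example, a single color channel) rather than HSV vectors, and the function names, the quantile initialization, the variance floor and the search range max_k are all assumptions made for the example.

```python
import math


def _densities(x, weights, means, variances):
    """Weighted Gaussian density of x under each mixture component."""
    return [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm_1d(data, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture by EM; return (log_lik, params)."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialization (assumption): means at spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    var_all = sum((x - centre) ** 2 for x in data) / n
    floor = 1e-6 * var_all + 1e-12  # variance floor against collapse
    variances = [max(var_all, floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = _densities(x, weights, means, variances)
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, floor)
    log_lik = sum(math.log(sum(_densities(x, weights, means, variances)) or 1e-300)
                  for x in data)
    return log_lik, (weights, means, variances)


def select_components_by_bic(data, max_k=5):
    """Return (k, params) of the mixture with the lowest BIC for k = 1..max_k."""
    best = None
    for k in range(1, max_k + 1):
        log_lik, params = fit_gmm_1d(data, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        bic = -2.0 * log_lik + n_params * math.log(len(data))
        if best is None or bic < best[0]:
            best = (bic, k, params)
    return best[1], best[2]


# Two well-separated groups of values, around 0 and around 10.
samples = [i / 10 for i in range(-10, 11)] + [10 + i / 10 for i in range(-10, 11)]
n_components, (weights, means, variances) = select_components_by_bic(samples)
```

The BIC penalty grows with the number of free parameters, so an extra component is accepted only when it raises the likelihood enough to pay for its parameters. This is what removes the need to fix the number of color components in advance.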
With color-spatial histograms as features, the next issue is
to improve the distance measure between them for image
retrieval. Various distance measures for histograms have been
proposed [9] (such as the Euclidean distance and histogram
intersection). The primary difficulty with these methods is
how to apply them to color-spatial histograms. In order to
overcome the drawback in [7], Conaire et al. [10] introduced
an improved method based on the Bhattacharyya coefficient, but
the Bhattacharyya-coefficient-based measure has poor
discriminative power [11]. Inspired by the symmetric KL
divergence [12], we use the Jensen-Shannon divergence, a
popular distance measure in information theory, to define our
distance measure.
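For reference, the standard Jensen-Shannon divergence between two normalized histograms can be computed as in the sketch below. It covers only the textbook definition on plain bin probabilities, using natural logarithms. How the measure is extended to the location statistics of a color-spatial histogram is not specified in this section, so that part is not attempted here. The function names are chosen for this example.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two normalized histograms.

    Defined as 0.5 * KL(p || m) + 0.5 * KL(q || m) with m = (p + q) / 2.
    Because m_i > 0 wherever p_i > 0 or q_i > 0, no division by zero occurs.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


same = jensen_shannon_divergence([0.5, 0.5], [0.5, 0.5])      # identical histograms
disjoint = jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0])  # no shared bins
```

Unlike the KL divergence, this measure is symmetric in its two arguments and stays finite when one histogram has empty bins; with natural logarithms it is bounded above by ln 2, which is reached for histograms with no bins in common.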
The contributions of the proposed method include the new
color quantization method based on the EM-BIC algorithm, the
revised color-spatial histogram, and a distance measure for
color-spatial histograms based on the Jensen-Shannon divergence.
All these characteristics make our method more robust and