MULTI-LABEL ACTIVE LEARNING FOR IMAGE CLASSIFICATION WITH
ASYMMETRICAL CONDITIONAL DEPENDENCE
Jian Wu (1,*), Shiquan Zhao (1), Victor S. Sheng (2), Pengpeng Zhao (1), Zhiming Cui (1)
(1) The Institute of Intelligent Information Processing and Application, Soochow University
Suzhou 215006, China
* jianwu@suda.edu.cn
(2) Department of Computer Science, University of Central Arkansas
Conway 72035, USA
ABSTRACT
Image classification is an active research topic in pattern
recognition and computer vision. Achieving high
classification accuracy requires a sufficient number of
high-quality labeled images, which are scarce in practice.
Active learning addresses this problem by selecting which
images to label. Label dependences play an important role in
multi-label active learning for image classification, and the
dependences between different labels usually differ and are
asymmetrical. This paper is the first to bring asymmetrical
conditional label dependences into active learning for
multi-label image classification, in a novel method called
ACDAL. Extensive experimental results on three image
datasets and two non-image datasets show that ACDAL
significantly outperforms existing approaches.
Index Terms—Active learning, multi-label image
classification, label correlation, asymmetrical label
dependence
1. INTRODUCTION
In traditional image classification, each image is assigned
to exactly one class. This setting is known as single-label
classification and includes both binary and multi-class
classification. However, in real-world applications, images
are often associated with multiple labels simultaneously
according to their content [1]. To train a multi-label image
classification model, we need a reasonable number of
high-quality labeled images.
However, in many real-world applications there are not
enough such images. Furthermore, the cost of acquiring
high-quality labeled images is very high, especially when
each image is associated with a large number of labels. To
reduce this effort and cost, active learning methods have
been widely used in this field. The main purpose of active
learning is to obtain the best model with the least labeling
effort and cost. If all labels in a multi-label image
classification problem were independent, the problem could
be solved easily by decomposing it into a series of binary
classification problems, an approach known as problem
transformation. However, labels are usually correlated in
multi-label classification, and this simple decomposition
ignores the correlations among them. Studies have shown
that these correlations are useful in active learning and
classification processes [2].
___________________________
This research was partially supported by the Natural Science
Foundation of China under grants No. 61402311 and No. 61440053,
Jiangsu Province Colleges and Universities Natural Science
Research Project under grant No. 13KJB520021, and the U.S.
National Science Foundation (IIS-1115417).
978-1-4799-7082-7/15/$31.00 ©2016 IEEE
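The decomposition into binary problems, and the label correlations it discards, can be illustrated with a minimal Python sketch. The toy label matrix, the label names (sea, ship, sky), and the function names are hypothetical, and the co-occurrence ratio used here is only one simple empirical estimate of conditional dependence; it is not necessarily the measure used by ACDAL.

```python
from typing import List, Sequence


def binary_relevance_targets(Y: Sequence[Sequence[int]]) -> List[List[int]]:
    """Problem transformation: split an n x m label matrix into m binary
    target vectors, one per label, each to be learned independently."""
    n_labels = len(Y[0])
    return [[row[j] for row in Y] for j in range(n_labels)]


def conditional_cooccurrence(Y: Sequence[Sequence[int]], i: int, j: int) -> float:
    """Empirical estimate of P(label j = 1 | label i = 1) from co-occurrence counts."""
    rows_with_i = [row for row in Y if row[i] == 1]
    if not rows_with_i:
        return 0.0  # label i never occurs, so the estimate is undefined; use 0
    return sum(row[j] for row in rows_with_i) / len(rows_with_i)


# Hypothetical toy data: columns are [sea, ship, sky], one row per image.
Y = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

# Each binary problem sees only its own column and nothing about the others.
targets = binary_relevance_targets(Y)

# The dependence between two labels is not symmetric in this toy data:
p_sea_given_ship = conditional_cooccurrence(Y, 1, 0)  # every ship image shows sea
p_ship_given_sea = conditional_cooccurrence(Y, 0, 1)  # only half the sea images show a ship
print(p_sea_given_ship, p_ship_given_sea)
```

In this toy data, knowing that an image contains a ship is more informative about the sea label than the reverse, which is the kind of asymmetry that independent binary classifiers cannot exploit.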
Active learning for object classification has been widely
studied. It aims to obtain a high-quality model by
querying a minimum number of samples, so as to reduce
labeling effort and cost. Given a large pool of unlabeled
samples, an active learner iteratively selects the most
informative example from the pool and queries an oracle or
human annotator for its true label(s). This is the well-known
pool-based sampling [3]. Existing research on multi-label
active learning can be divided into two major categories:
example-based sampling strategies and example-label pair
based sampling strategies.
Example-based sampling strategies take each example as a
sampling unit and measure the informativeness of each
complete example. Example-label pair based strategies take
each example-label pair as a sampling unit and measure both
the informativeness of each example and the informativeness
of each label of that example. An intelligent example-label
pair based sampling strategy can greatly reduce the labeling
effort, especially when an example is associated with a
large number of labels.
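The difference between the two sampling units can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the predicted probabilities are hypothetical, and the uncertainty score (closeness of a predicted label probability to 0.5) is a generic informativeness measure chosen for the example, not the criterion of any specific method discussed here.

```python
from typing import List, Tuple


def label_uncertainty(p: float) -> float:
    """Uncertainty of one binary label prediction; 1.0 at p = 0.5, 0.0 at p = 0 or 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def select_example(probs: List[List[float]]) -> int:
    """Example-based sampling: return the index of the unlabeled example whose
    labels are, in total, most uncertain. All of its labels are then queried."""
    scores = [sum(label_uncertainty(p) for p in row) for row in probs]
    return max(range(len(probs)), key=scores.__getitem__)


def select_example_label_pair(probs: List[List[float]]) -> Tuple[int, int]:
    """Example-label pair based sampling: return the single (example, label)
    pair that is most uncertain. Only that one label is then queried."""
    pairs = [(i, j) for i, row in enumerate(probs) for j in range(len(row))]
    return max(pairs, key=lambda ij: label_uncertainty(probs[ij[0]][ij[1]]))


# Hypothetical predicted label probabilities for a pool of three unlabeled
# images with three candidate labels each.
probs = [
    [0.90, 0.60, 0.10],
    [0.50, 0.99, 0.01],
    [0.60, 0.60, 0.40],
]

chosen_example = select_example(probs)          # annotator must supply 3 labels
chosen_pair = select_example_label_pair(probs)  # annotator must supply 1 label
print(chosen_example, chosen_pair)
```

With these numbers the two strategies pick different images: the example-based strategy selects the image whose labels are all moderately uncertain, while the pair-based strategy selects the single most ambiguous label and leaves the confidently predicted labels of that image unqueried.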
There are several articles on active learning for multi-
label image classification. Li et al. [4] first brought active