Knowledge-Based Systems 104 (2016) 52–61
Multi-label learning with label-specific feature reduction
Suping Xu a,e,f, Xibei Yang a,b,e,f,∗, Hualong Yu a, Dong-Jun Yu c, Jingyu Yang c, Eric C.C. Tsang d
a School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, PR China
b School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094, PR China
c Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information, Nanjing University of Science and Technology, Ministry of Education, Nanjing 210094, PR China
d Faculty of Information Technology, Macau University of Science and Technology, 519020, Macau
e Intelligent Information Processing Key Laboratory of Shanxi Province, Shanxi University, Taiyuan 030006, PR China
f Key Laboratory of Oceanographic Big Data Mining and Application of Zhejiang Province, Zhejiang Ocean University, Zhoushan 316022, PR China
Article info
Article history:
Received 22 December 2015
Revised 10 March 2016
Accepted 13 April 2016
Available online 25 April 2016
Keywords:
Feature reduction
Fuzzy rough set
Label-specific feature
Multi-label learning
Sample selection
Abstract
In multi-label learning, since different labels may have some distinct characteristics of their own, a multi-label learning approach with label-specific features, named LIFT, has been proposed. However, the construction of label-specific features may increase the feature dimensionality and leave a large amount of redundant information in the feature space. To alleviate this problem, a multi-label learning approach called FRS-LIFT is proposed, which implements label-specific feature reduction with the fuzzy rough set. Furthermore, based on the idea of sample selection, another multi-label learning approach, FRS-SS-LIFT, is also presented, which effectively reduces the computational complexity of label-specific feature reduction. Experimental results on 10 real-world multi-label data sets show that, compared with LIFT, our methods can not only reduce the dimensionality of the label-specific features, but also achieve satisfactory performance against several popular multi-label learning approaches.
©2016 Elsevier B.V. All rights reserved.
1. Introduction
Nowadays, the multi-label learning problem has received increasing attention in real-world applications. For example, in semantic annotation of images [3,16,26,49], a picture can be annotated as camel, desert and landscape. In text categorization [5,11,17,29], a document may belong to several given topics, such as economics, finance or GDP. In bioinformatics [6,13,50], each gene may be associated with a set of functional classes, such as metabolism, transcription and protein synthesis. In all of the cases above, each sample may be associated with more than one label simultaneously, and the predefined labels for different samples are not mutually exclusive but may overlap. This situation is distinct from traditional single-label learning, where the predefined labels are mutually exclusive and each sample belongs to exactly one label.
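The distinction above can be made concrete by encoding each sample's label set as a row of a binary indicator matrix, a representation commonly used in multi-label learning. A minimal sketch follows; the sample contents and label names are illustrative only, not taken from any of the cited data sets:

```python
# Multi-label data: each sample carries a *set* of labels, encoded as a
# row of a binary indicator matrix Y (1 = label present, 0 = absent).
labels = ["camel", "desert", "landscape", "economics"]

# Label sets for three hypothetical samples; the sets may overlap,
# unlike the mutually exclusive classes of single-label learning.
label_sets = [
    {"camel", "desert", "landscape"},   # an image with three labels
    {"economics"},                      # a document with a single topic
    {"desert", "economics"},            # labels shared across domains
]

# Build the indicator matrix row by row.
Y = [[1 if lab in s else 0 for lab in labels] for s in label_sets]

for row in Y:
    print(row)
# -> [1, 1, 1, 0]
# -> [0, 0, 0, 1]
# -> [0, 1, 0, 1]
```

In this form, single-label learning is simply the special case where every row of Y sums to one.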
Over the last decade, many multi-label learning approaches have been proposed [12,28,58]. Generally, the existing methods can be grouped into two main categories [43], i.e., algorithm
∗ Corresponding author at: School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, PR China.
E-mail addresses: supingxu@yahoo.com (S. Xu), yangxibei@hotmail.com (X. Yang), yuhualong@just.edu.cn (H. Yu), njyudj@njust.edu.cn (D.-J. Yu), yangjy@mail.njust.edu.cn (J. Yang), cctsang@must.edu.mo (E.C.C. Tsang).
adaptation methods and problem transformation methods. Algorithm adaptation methods extend specific single-label learning algorithms to directly handle multi-label data by modifying some constraint conditions; examples include AdaBoost.MH [40], ML-kNN [59], MLNB [60], and RankSVM [9]. Problem transformation methods transform the multi-label task into one or more corresponding single-label tasks and then handle them one by one with traditional methods. Well-known problem transformation methods include binary relevance (BR), label power set (LP) and pruned problem transformation (PPT). BR [3] learns a binary classifier for each label independently and predicts each label separately, so it severs the relationships among different labels. LP [44] treats each unique set of labels that occurs in a multi-label training set as a new single-label multi-value class. Although this method considers the correlations among different labels, it easily leads to high time consumption, since the number of new classes increases exponentially with the number of labels. Meanwhile, some new classes supported by only a few samples may lead to a class imbalance problem. PPT [34] either abandons the new classes associated with extremely small numbers of samples, or reassigns those samples to new labels that form acceptable classes; however, the abandoned classes cause a loss of multi-label information. Although the above methods have achieved good performance in multi-label learning, they make use of the same features to achieve
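The BR and LP transformations described above can be sketched in a few lines of plain Python; this is a hedged illustration of the two transformations on a toy binary label matrix, not the implementations used in the cited works:

```python
# Problem transformation sketches on a binary label matrix Y
# (rows = samples, columns = labels).
Y = [
    [1, 0, 1],   # sample 0: label set {0, 2}
    [1, 0, 1],   # sample 1: same label set as sample 0
    [0, 1, 0],   # sample 2: label set {1}
]

# Binary relevance (BR): the j-th binary task is simply column j of Y,
# learned independently, so correlations among labels are discarded.
br_targets = [[row[j] for row in Y] for j in range(len(Y[0]))]

# Label power set (LP): map every distinct label set to a fresh class id.
# The number of classes can grow exponentially with the number of labels,
# and rare label sets yield tiny classes -- the imbalance problem that
# PPT addresses by pruning or reassigning them.
class_of = {}
lp_targets = [class_of.setdefault(tuple(row), len(class_of)) for row in Y]

print(br_targets)  # -> [[1, 1, 0], [0, 0, 1], [1, 1, 0]]
print(lp_targets)  # -> [0, 0, 1]: two distinct label sets, two classes
```

Each list in `br_targets` would be fed to its own single-label binary learner, while `lp_targets` would be fed to one multi-class learner.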
http://dx.doi.org/10.1016/j.knosys.2016.04.012
0950-7051/© 2016 Elsevier B.V. All rights reserved.