Song et al. / Front Inform Technol Electron Eng 2016 17(9):897-906
Frontiers of Information Technology & Electronic Engineering
www.zju.edu.cn/jzus; engineering.cae.cn; www.springerlink.com
ISSN 2095-9184 (print); ISSN 2095-9230 (online)
E-mail: jzus@zju.edu.cn
Two-level hierarchical feature learning for image classification*
Guang-hui SONG¹,², Xiao-gang JIN†‡¹, Gen-lang CHEN², Yan NIE³
(¹College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China)
(²Ningbo Institute of Technology, Zhejiang University, Ningbo 315100, China)
(³College of Science and Technology, Ningbo University, Ningbo 315100, China)
†E-mail: xiaogangj@cise.zju.edu.cn
Received Oct. 20, 2015; Revision accepted Apr. 10, 2016; Crosschecked Aug. 8, 2016
Abstract: In some image classification tasks, the degree of similarity varies among categories, and samples
are often misclassified into highly similar categories. To distinguish highly similar categories, more specific features
are required so that the classifier can improve the classification performance. In this paper, we propose a novel
two-level hierarchical feature learning framework based on the deep convolutional neural network (CNN), which is
simple and effective. First, the deep feature extractors of different levels are trained using the transfer learning
method that fine-tunes the pre-trained deep CNN model toward the new target dataset. Second, the general feature
extracted from all the categories and the specific feature extracted from highly similar categories are fused into a
feature vector. Then the final feature representation is fed into a linear classifier. Finally, experiments using the
Caltech-256, Oxford Flower-102, and Tasmania Coral Point Count (CPC) datasets demonstrate that the deep
features resulting from two-level hierarchical feature learning have strong representational power. Our proposed
method effectively increases the classification accuracy in comparison with flat multi-class classification methods.
Key words: Transfer learning, Feature learning, Deep convolutional neural network, Hierarchical classification,
Spectral clustering
http://dx.doi.org/10.1631/FITEE.1500346 CLC number: TP391.4
‡ Corresponding author
* Project supported by the National Natural Science Foundation of China (No. 61379074) and the Zhejiang
Provincial Natural Science Foundation of China (Nos. LZ12F02003 and LY15F020035)
ORCID: Xiao-gang JIN, http://orcid.org/0000-0002-7787-7228
© Zhejiang University and Springer-Verlag Berlin Heidelberg 2016
1 Introduction
The deep convolutional neural network (CNN)
has achieved impressive classification performance on
the ImageNet benchmark (Krizhevsky et al., 2012).
Surprisingly, transfer learning methods based on
deep convolutional features trained on a generic
recognition task are also successful in various computer
vision tasks, such as object classification, domain
adaptation, and scene recognition. They achieve
results superior to those of previous methods
(Donahue et al., 2014; Zeiler and Fergus, 2014;
Cai et al., 2015). Therefore, the feature learning
ability of the deep CNN has received considerable
attention. In these studies, deep CNN models were
used as feature extractors rather than as classifiers, and
they provided a way to obtain more specific visual
features (Yosinski et al., 2014).
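To make the idea of using a deep CNN as a feature extractor rather than as a classifier more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that two feature vectors (a general one and a specific one, as outlined in the abstract) have already been extracted for an image, and it uses small hand-written vectors in place of real CNN activations. The function names `fuse_features`, `linear_scores`, and `predict`, as well as all numeric values, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: toy vectors stand in for deep CNN features.


def fuse_features(general, specific):
    """Concatenate a general and a specific feature vector into one vector."""
    return list(general) + list(specific)


def linear_scores(features, weights, biases):
    """Compute one linear score per class: dot(w, x) + b."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(w, features)) + b
        for w, b in zip(weights, biases)
    ]


def predict(features, weights, biases):
    """Return the index of the class with the highest linear score."""
    scores = linear_scores(features, weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical 2-D general feature and 2-D specific feature for one image.
general = [0.5, 1.0]
specific = [2.0, 0.0]
fused = fuse_features(general, specific)

# Hypothetical weights of a 3-class linear classifier over the 4-D fused vector.
weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
biases = [0.0, 0.0, 0.0]

label = predict(fused, weights, biases)
```

In practice the weights would be learned from training data and the feature vectors would have far more dimensions; the sketch only shows how a fused feature vector can be scored by a linear classifier that is separate from the network producing the features.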
At present, most deep CNN models serve as
flat end-to-end classifiers for image recognition tasks.
These deep models take the raw image as the network
input, extract image features through layers of
convolutional filters trained by back-propagation, and
finally output the categorized results using a softmax
output layer. In reality, however, image datasets
are growing in both sample size and number of image
categories. Similarities differ among categories, with