762 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 6, NO. 4, OCTOBER 2009
Ensemble Classification Algorithm for Hyperspectral
Remote Sensing Data
Mingmin Chi, Member, IEEE, Qian Kun, Jón Atli Benediktsson, Fellow, IEEE, and Rui Feng
Abstract—In real applications, it is difficult to obtain a sufficient number of training samples for the supervised classification of hyperspectral remote sensing images. Furthermore, the available training samples may not represent the real distribution of the whole feature space. To address these problems, an ensemble algorithm that combines generative (mixture of Gaussians) and discriminative (support cluster machine) models for classification is proposed. Experimental results obtained on a hyperspectral data set collected by the reflective optics system imaging spectrometer sensor validate the effectiveness of the proposed approach.
Index Terms—Ensemble classification, hyperspectral remote
sensing images, mixture of Gaussians (MoGs), support cluster
machine (SCM).
I. INTRODUCTION
HYPERSPECTRAL remote sensing images are very important for the discrimination of spectrally similar land-cover classes. In order to obtain a reliable classifier, a larger number of representative training samples is necessary for hyperspectral data than for multispectral remote sensing data.
In real applications, it is difficult to obtain a sufficient number of training samples for supervised learning. Furthermore, the training samples may not represent the real distribution of the whole space. These issues result in a quantity problem for the training samples in the design of a robust supervised classifier.
In recent years, semisupervised learning (SSL) methods [1]–[3] have usually been exploited to overcome the problems caused by small numbers of labeled samples in the classification of hyperspectral remote sensing images; examples include self-labeling approaches [1], low-density separation SSL approaches [2], and label-propagation SSL approaches [3]. The aforementioned methods usually exploit either generative or discriminative approaches, where an estimation criterion is used for adjusting the parameters and/or structure of the classifier.
There is little literature on the use of both generative and discriminative models for the quantity problem. In [4], the authors worked on a generative model and adopted a discriminative model to correct the bias of a generative classifier learnt from a small training set.

Manuscript received December 22, 2008; revised April 10, 2009. First published July 28, 2009; current version published October 14, 2009. This work was supported in part by the Natural Science Foundation of China under Contract 60705008, by the Ph.D. Programs Foundation of the Ministry of Education of China under Contract 20070246132, and by the Research Fund of the University of Iceland.
M. Chi, Q. Kun, and R. Feng are with the School of Computer Science, Fudan University, Shanghai 200433, China (e-mail: mmchi@fudan.edu.cn; 0314018@fudan.edu.cn; fengrui@fudan.edu.cn).
J. A. Benediktsson is with the Faculty of Electrical and Computer Engineering, University of Iceland, 107 Reykjavik, Iceland (e-mail: benedikt@hi.is).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2009.2024624
In this letter, we propose an ensemble algorithm that combines the advantages of both generative and discriminative models to deal with the quantity problem in the classification of hyperspectral remote sensing images. In particular, both labeled and unlabeled data are represented with a generative model [i.e., a mixture of Gaussians (MoGs)]. Then, the estimated model is used for discriminative learning. This is motivated by a recently proposed discriminative classification approach, the support cluster machine (SCM) [5]. The SCM was originally used to address large-scale supervised learning problems. The main idea of the SCM is that the labeled data are first modeled using a generative model. Then, the kernel, i.e., the similarity measure between Gaussians, is defined by probability product kernels (PPKs) [6]. The obtained PPK kernel matrix is then used to train support vector machines (SVMs), where the learned models contain support clusters rather than support vectors (hence the name SCM).
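For intuition, the PPK in [6] admits a closed form between two Gaussian components: with the kernel exponent set to 1 (the expected likelihood kernel), the integral of the product of two Gaussians reduces to a single Gaussian density evaluation. The sketch below illustrates this special case only; the exponent choice and the toy inputs are illustrative assumptions, not the letter's exact configuration.

```python
import numpy as np
from scipy.stats import multivariate_normal


def ppk_gaussian(mu_p, cov_p, mu_q, cov_q):
    """PPK between two Gaussians for exponent rho = 1 (expected
    likelihood kernel), using the closed form
    K(p, q) = N(mu_p; mu_q, cov_p + cov_q)."""
    return multivariate_normal.pdf(mu_p, mean=mu_q, cov=cov_p + cov_q)


# Toy check in one dimension: two identical Gaussians yield a larger
# kernel value (higher similarity) than two well-separated ones.
k_same = ppk_gaussian(np.zeros(1), np.eye(1), np.zeros(1), np.eye(1))
k_far = ppk_gaussian(np.zeros(1), np.eye(1), 5.0 * np.ones(1), np.eye(1))
assert k_same > k_far
```

Evaluating this kernel over all pairs of mixture components yields the kernel matrix that is then passed to a standard SVM solver.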
In the SCM, the number of clusters is important for obtaining the best classification results. If the selected number of Gaussians (the components are not limited to Gaussians) does not fit the data well, the classification accuracy can decrease. Moreover, for a small-size training set, a mixture model estimated from only the labeled samples cannot represent the distribution of the whole data.
To address the aforementioned problem, it is proposed here to first use both labeled and unlabeled samples to estimate an MoG. Then, different sets of MoGs are generated by going from few (coarse representation) to many (fine representation) clusters. For each of the estimated MoGs, the corresponding PPK kernel matrix can be computed and used as input to a standard SVM for training. Finally, the output classification result is obtained by combining, with an ensemble technique, the results produced by the individual SCMs learnt with the different sets of MoGs. The accuracy and reliability of the proposed algorithm have been evaluated on reflective optics system imaging spectrometer (ROSIS) hyperspectral remote sensing data collected over the University of Pavia, Italy. The results are promising when compared to state-of-the-art classifiers.
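The combination step above can be sketched with a simple majority vote over the per-pixel labels produced by the individual SCMs; the letter only states that an ensemble technique is used, so the voting rule, the function name, and the toy labels below are illustrative assumptions rather than the authors' exact scheme.

```python
import numpy as np


def majority_vote(predictions):
    """Combine per-pixel class labels from several classifiers
    (one row per ensemble member) by majority voting; ties are
    broken in favour of the smallest class label."""
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    # For each pixel (column), count the votes each class received.
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, predictions)
    # The winning class per pixel is the one with the most votes.
    return votes.argmax(axis=0)


# Three hypothetical SCMs (trained on MoGs of different sizes)
# classifying five pixels into classes {0, 1, 2}.
labels = majority_vote([[0, 1, 2, 1, 0],
                        [0, 1, 2, 2, 1],
                        [0, 2, 2, 1, 1]])
# -> [0, 1, 2, 1, 1]
```

Each ensemble member here would correspond to one SCM trained on a PPK kernel matrix built from one MoG size.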
The rest of this letter is organized as follows. The next section describes the proposed ensemble algorithm with generative/discriminative models. Section III describes the data used in the experiments and reports and discusses the results provided by the different algorithms. Finally, conclusions and a discussion are given in Section IV.
1545-598X/$26.00 © 2009 IEEE