Classification of hyperspectral remote-sensing data with primal SVM for small-sized training dataset problem

Mingmin Chi a,*, Rui Feng a, Lorenzo Bruzzone b

a Department of Computer Science and Engineering, Fudan University, 220 Han Dan Road, Shanghai 200433, China
b Department of Information and Communication Technologies, University of Trento, Italy

Received 1 November 2006; received in revised form 3 February 2008; accepted 6 February 2008
Abstract
With recent technological advances in remote sensing, very high-dimensional (hyperspectral) data are available for better discrimination among complex land-cover classes with similar spectral signatures. However, this large number of bands makes the task of automatic data analysis very complex. In real applications, it is difficult and expensive for experts to acquire enough training samples to learn a classifier, which results in classification problems with small-size training sample sets. Regularization-based algorithms, such as the Support Vector Machine (SVM), are usually proposed to handle such problems. SVMs are typically implemented in the dual form derived from Lagrange theory, but they can also be solved directly in the primal formulation. In this paper, we introduce an alternative implementation technique for SVM to address the classification problem with small-size training sample sets. The effectiveness of the introduced implementation technique has been empirically evaluated on benchmark datasets.
© 2008 COSPAR. Published by Elsevier Ltd. All rights reserved.
Keywords: Primal Support Vector Machine (SVM); Classification; Small-size training dataset problem; Hyperspectral remote-sensing data
1. Introduction
One of the most critical problems relating to the super-
vised classification of remote-sensing images lies in the def-
inition of a proper size of training set for an accurate
learning of classifiers. Since the collection of ground-refer-
ence data is an expensive and complex task, in many cases
the number of training samples is insufficient for a proper
learning of classification systems. This issue is particularly
critical when hyperspectral images are considered. Such
hyperspectral data are generally made of about 100–200
spectral channels of relatively narrow bandwidths (5–
10 nm). Although high-dimensional features are capable
of better discriminating among the complex (sub)classes,
in the real application, it is difficult and expensive for
experts to acquire enough training samples to learn a clas-
sifier. Consequently, it is impossible to meet the requirements on the necessary number of training samples since the size of the training dataset is relatively fixed.
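As an illustration of the small-sample, high-dimensional setting discussed above, the following sketch trains a linear SVM directly in the primal (L2 regularization plus hinge loss, minimized by subgradient descent) on synthetic data with far more features than labeled samples. The data, dimensions, and all parameter values are illustrative assumptions for this sketch, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setting: 20 labeled samples, 100 "spectral" features.
n, d = 20, 100
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true)  # labels in {-1, +1}

# Primal SVM objective: (lam/2)*||w||^2 + mean_i max(0, 1 - y_i <w, x_i>),
# minimized directly by subgradient descent (no dual/Lagrange step).
lam, lr, epochs = 0.1, 0.01, 500
w = np.zeros(d)
for _ in range(epochs):
    margins = y * (X @ w)
    active = margins < 1                   # samples violating the margin
    grad = lam * w - (active * y) @ X / n  # subgradient of the objective
    w -= lr * grad

train_acc = np.mean(np.sign(X @ w) == y)
print(f"training accuracy: {train_acc:.2f}")
```

Because the regularization term penalizes ||w||, the classifier can still be fitted when d exceeds n, which is exactly the small-sized training set regime the paper targets.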
When the number of (representative) training samples is
relatively small with respect to the number of features (and
thus of classifier parameters to be estimated), the well-
known problem of the curse of dimensionality (i.e., the Hughes phenomenon; Hughes, 1968)^1 occurs. This results
in the risk of overfitting of the training data and can lead
to poor generalization capabilities of the classifier. Conven-
tional classification methods, such as the Gaussian Maximum Likelihood algorithm, cannot be applied to
hyperspectral data due to the high dimensionality of the
doi:10.1016/j.asr.2008.02.012
Expanded version of a talk presented at COSPAR on terrestrial phenomena and land products from space: validation, application and perspectives (Beijing, China, July 2006).
* Corresponding author. Tel.: +86 21 5566228. E-mail addresses: mmchi@fudan.edu.cn (M. Chi), fengrui@fudan.edu.cn (R. Feng), lorenzo.bruzzone@ing.unitn.it (L. Bruzzone).
1 With more discriminative features, classification performance improves as the number of labeled samples increases; if the number of labeled samples is fixed, however, performance eventually decreases as more features are added.
Advances in Space Research 41 (2008) 1793–1799