Bagging-like metric learning for support vector regression
Peng-Cheng Zou *, Jiandong Wang, Songcan Chen *, Haiyan Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Article info
Article history:
Received 26 December 2013
Received in revised form 28 February 2014
Accepted 2 April 2014
Available online 19 April 2014
Keywords:
Distance metric learning
Support vector regression
Ensemble learning
Bagging
Distance-based kernel
Abstract
Metrics play an important role in machine learning and pattern recognition. Though many off-the-shelf metrics are available for learning tasks at hand, such as k-nearest neighbor classification and k-means clustering, such a selection is not necessarily appropriate because it is made independently of the data itself. It has been shown that a task-dependent metric learned from the given data can yield better learning performance. Inspired by this success, we focus on learning an embedded metric specifically for support vector regression and present a corresponding learning algorithm, termed SVRML, which minimizes the error on a validation dataset while simultaneously enforcing sparsity on the learned metric matrix. Further, taking the learned metric (a positive semi-definite matrix) as a base learner, we develop an effective bagging-like ensemble metric learning framework in which the resampling mechanism of original bagging is specially modified for SVRML. Experiments on various datasets demonstrate that our method outperforms both single and bagging-based ensemble metric learning for support vector regression.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
Metric learning plays an important role in many learning tasks
including k-nearest neighbor classification, k-means clustering and
kernel-based algorithms such as support vector machines [1–5]. In
recent years, many studies have demonstrated empirically and
theoretically that it is often beneficial for a learning task to learn
a metric from the given data, instead of using an off-the-shelf
one such as Euclidean distance metric.
Depending on the availability of supervision in the given data, these methods roughly fall into two main categories: unsupervised metric learning and supervised metric learning. Unsupervised metric learning methods essentially learn a distance metric without any supervised information [6,7]. In supervised metric learning, by contrast, additional information about the data, such as label information, is used to learn the metric, which is therefore better able to capture the idiosyncrasies of the data of interest [8,9]. We pay particular attention to supervised methods in this paper.
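Throughout this line of work, the learned metric is commonly parameterized as a Mahalanobis distance; the following standard formulation (a generic statement, not specific to any single cited method) makes the learning target explicit:

d_M(x_i, x_j) = \sqrt{(x_i - x_j)^\top M (x_i - x_j)}, \qquad M \succeq 0,

where the positive semi-definite matrix M is what is actually learned from the data; taking M = I recovers the ordinary Euclidean distance.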
Supervised distance metric learning can be further divided into task-independent and task-dependent metric learning. The task-independent methods usually consist of two separate learning steps: in the first step, a metric is learned by solving an optimization problem with the supervised information; the second step then uses the learned metric to solve a subsequent task. Classical Linear Discriminant Analysis (LDA), though a dimensionality reduction method, can also be viewed as a pseudo-metric learning method [10], and the metric learned by LDA can be used in many subsequent tasks such as k-nearest neighbor classification. In addition, MMC by Xing et al. learns a metric by minimizing the distances between points under equivalence constraints while maximizing the distances between points under inequivalence constraints; the learned metric is then used in different clustering tasks [1].
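To illustrate the two-step, task-independent pipeline just described, here is a minimal sketch (our own illustration, assuming scikit-learn is available, not code from the cited works): step 1 learns an LDA projection from the labels alone, and step 2 reuses that fixed metric for k-nearest neighbor classification, since Euclidean distance in the projected space is a Mahalanobis distance in the original space induced by the learned projection.

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: learn the (pseudo-)metric from the supervised data alone,
# without any reference to the subsequent task.
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)

# Step 2: hand the fixed, already-learned metric to the subsequent task
# (here k-NN in the LDA-projected space).
knn = KNeighborsClassifier(n_neighbors=3).fit(lda.transform(X_tr), y_tr)
print("k-NN accuracy with the LDA metric:", knn.score(lda.transform(X_te), y_te))

Note that nothing in step 1 is informed by the k-NN decision rule of step 2, which is exactly the limitation discussed next.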
Though the task-independent methods do use the supervised information when learning the metric, such a two-step procedure cannot guarantee that the learned metric is optimal for the subsequent task. A more desirable approach is therefore to learn the metric directly by incorporating the specific subsequent task, as task-dependent distance metric learning does. The situation is analogous to feature selection, where embedded methods can usually achieve better performance than filter methods [11]: task-independent metric learning corresponds to the filter approach, while task-dependent metric learning corresponds to the embedded approach. One of the most representative works is Large Margin Nearest Neighbor (LMNN) [2], in which the learned metric is tailored specifically to k-nearest neighbor classification and leads to significant improvement over k-NN with task-independent metrics. Several related methods have also been proposed, such as Neighborhood Components Analysis (NCA) [4], multi-task LMNN [12] and non-linear LMNN [13].
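To make the notion of task-dependence concrete, the LMNN objective of [2] can be written (in our paraphrase, using the Mahalanobis notation above with squared distances d_M^2, target neighbors j ⇝ i, and trade-off constant c) as

\varepsilon(M) = \sum_{j \rightsquigarrow i} d_M^2(x_i, x_j) + c \sum_{j \rightsquigarrow i} \sum_{l : y_l \neq y_i} \big[ 1 + d_M^2(x_i, x_j) - d_M^2(x_i, x_l) \big]_+ ,

so the metric is optimized directly against the k-NN decision rule, pulling target neighbors close while pushing differently labeled points beyond a unit margin, rather than against a task-agnostic criterion.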
It should be noted that most of the existing task-dependent metric learning methods are designed for classification tasks, especially k-NN. Similar to classification, regression is another important task in machine learning, and its performance likewise depends on the metric adopted.
* Corresponding authors. Tel.: +86 15850685790. E-mail addresses: zou_pc@163.com (P.-C. Zou), s.chen@nuaa.edu.cn (S. Chen).