Semi-supervised Sparse Feature Selection based on
Multi-view Hessian Regularization
Caijuan Shi, Jian Liu, Liping Liu, Xiaodong Yan
College of Information Engineering, North China University of Science and Technology, Tangshan, China
scj-blue@163.com
Abstract—Semi-supervised sparse feature selection has received
increasing attention in recent years. However, most semi-
supervised feature selection algorithms are developed for
single-view data and cannot naturally handle data represented by
multi-view features. Moreover, most existing semi-supervised
sparse feature selection methods are based on Laplacian
regularization, which lacks extrapolating power. Therefore, in
this paper we present a new semi-supervised sparse feature
selection framework based on multi-view Hessian regularization
to obtain better performance. A simple yet efficient iterative
method is proposed to solve the objective function. We apply the
proposed method to the image annotation task and conduct
extensive experiments on two web image datasets. Experimental
results show that the proposed method performs feature
selection well.
Keywords-multi-view learning; Hessian regularization; semi-
supervised sparse feature selection; web image annotation.
I. INTRODUCTION
Recently, semi-supervised sparse feature selection
approaches have attracted increasing research interest.
However, most existing semi-supervised feature selection
methods, such as [1], [2] and [3], are developed for
single-view data. When these methods are confronted with
multi-view data, they often directly concatenate the multi-view
features into a single long vector. However, each type of
feature characterizes the data in one specific feature space and
has its own physical meaning and statistical properties. This
concatenation strategy therefore cannot efficiently exploit the
complementarity of the different views.
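The naive concatenation strategy criticized above can be illustrated with a small sketch (the view dimensionalities here are hypothetical and chosen only for illustration; NumPy is assumed):

```python
import numpy as np

# Hypothetical example: three views of the same n = 5 images, e.g. a
# 64-d color histogram, a 73-d edge histogram, and a 128-d SIFT codebook.
n = 5
rng = np.random.default_rng(0)
views = [rng.random((64, n)), rng.random((73, n)), rng.random((128, n))]

# Naive single-view treatment: stack all views into one long feature
# vector per sample, discarding the view boundaries entirely.
X_concat = np.vstack(views)   # shape (64 + 73 + 128, n) = (265, 5)
print(X_concat.shape)
```

Once stacked, any subsequent learner sees a single 265-dimensional feature space and can no longer model the per-view statistics that motivate the multi-view formulation below.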
It has been shown extensively that multi-view learning can
address the above problem by leveraging the correlated and
complementary information between different views. In [4], Xu
et al. reviewed multi-view learning in detail. In [5],
Feng et al. proposed an adaptive unsupervised multi-view
feature selection method for visual concept recognition. However,
to the best of our knowledge, multi-view learning has not been
applied to semi-supervised sparse feature selection. In this
paper, we incorporate multi-view learning into our semi-supervised
sparse feature selection framework to select more compact and
accurate features.
Although graph-Laplacian-based semi-supervised
learning approaches have been widely applied to semi-
supervised feature selection [1], [2], Hessian regularization has
better extrapolating power and can boost semi-supervised
learning performance compared to Laplacian regularization [6].
We therefore apply Hessian regularization in our multi-view
semi-supervised sparse feature selection framework.
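For contrast with the Hessian term adopted here, the standard graph Laplacian regularizer used by [1], [2] can be sketched as follows (a minimal toy example, not the paper's exact graph construction; the affinity matrix W and label matrix F are invented for illustration):

```python
import numpy as np

# Toy symmetric affinity matrix over n = 4 samples and a soft-label
# matrix F with c = 2 classes (one row per sample).
W = np.array([[0., 1., 1., 0.],
              [1., 0., 0., 1.],
              [1., 0., 0., 1.],
              [0., 1., 1., 0.]])
D = np.diag(W.sum(axis=1))   # degree matrix
L = D - W                    # unnormalized graph Laplacian
F = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])

# tr(F^T L F) equals 0.5 * sum_ij W_ij * ||f_i - f_j||^2: it penalizes
# label disagreement between strongly connected samples.
reg = np.trace(F.T @ L @ F)
pairwise = 0.5 * sum(W[i, j] * np.sum((F[i] - F[j]) ** 2)
                     for i in range(4) for j in range(4))
print(reg, pairwise)
```

Because this penalty depends only on pairwise differences along graph edges, the regularized function is driven toward a constant off the data manifold, which is the limited extrapolating power that Hessian regularization is meant to remedy.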
In this paper, we propose a new semi-supervised sparse
feature selection framework based on multi-view Hessian
regularization, namely Multi-view Hessian Regularization
Feature Selection (MHRFS). MHRFS utilizes multi-view
learning and Hessian regularization simultaneously to boost the
performance of semi-supervised sparse feature selection.
An effective iterative algorithm is proposed to optimize the
objective function. MHRFS is applied to the large-scale web
image annotation task, and extensive experiments are conducted
on two web image datasets: the NUS-WIDE [7] dataset and the
MSRA-MM 2.0 [8] dataset.
II. THE PROPOSED FRAMEWORK
A. MHRFS Formulation
In the MHRFS framework, a multi-view training dataset of n
observations from m views is given. X = {x_1, x_2, ..., x_n} is
denoted as the training dataset, including q labeled data and n−q
unlabeled data. The feature data matrix of the vth view can be
denoted as X_v ∈ R^{d_v×n}, and the feature data matrix of all
views can be denoted as X = [X_1^T, X_2^T, ..., X_m^T]^T ∈ R^{d×n},
where d = Σ_{v=1}^{m} d_v.
Let Y ∈ R^{n×c} be the label matrix of the training dataset, where
c is the number of classes and y_i ∈ R^c is the ith label
vector. Denote F ∈ R^{n×c} and F_v ∈ R^{n×c} as the predicted
label matrix for all views and the vth view
predicted label matrix, respectively. Let G ∈ R^{d×c} be the
projection matrix, which is regarded as the combination
coefficients for the most discriminative features. In order to
realize sparse feature selection with the optimal projection
matrix G, we exploit the l_{2,1/2}-matrix norm as the sparse model
due to its efficacy [1]. Then the sparse feature selection
framework based on the l_{2,1/2}-matrix norm can be generalized as
the following objective function:

    min_G loss(G) + λ ||G||_{2,1/2}^{1/2},    (1)

where loss(G) is the loss function and λ||G||_{2,1/2}^{1/2} is the
regularization term with λ as the regularization parameter. The
definition of ||G||_{2,1/2}^{1/2} is:
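As a concrete illustration of the sparsity term, the following sketch computes the l_{2,1/2}-matrix norm from its standard definition, ||G||_{2,1/2} = (Σ_i ||g^i||_2^{1/2})^2 with g^i the ith row of G, and ranks features by row l2-norm (the toy matrix G is invented; NumPy is assumed):

```python
import numpy as np

def l2_half_norm(G):
    """Return ||G||_{2,1/2} = (sum_i ||g^i||_2^{1/2})^2 over rows g^i."""
    row_l2 = np.sqrt((G ** 2).sum(axis=1))   # ||g^i||_2 for each row/feature
    return (np.sqrt(row_l2).sum()) ** 2

# Toy projection matrix: rows correspond to features; a zero row means
# the corresponding feature receives no weight and is effectively pruned.
G = np.array([[3., 4.],    # ||g^0||_2 = 5
              [0., 0.],    # ||g^1||_2 = 0  -> feature not selected
              [0., 1.]])   # ||g^2||_2 = 1
norm_val = l2_half_norm(G)

# Features are ranked by row l2-norm; the top-k rows of the learned G
# give the k selected features.
ranking = np.argsort(-np.sqrt((G ** 2).sum(axis=1)))
print(norm_val, ranking)
```

Because the exponent 1/2 on the row norms penalizes small-but-nonzero rows relatively harshly, minimizing (1) drives entire rows of G to zero, which is exactly the row-sparsity that feature selection requires.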