978-1-5386-1106-7/17/$31.00 ©2017 IEEE
The 2017 4th International Conference on Systems and Informatics (ICSAI 2017)
Large Margin Distribution Machine
Recursive Feature Elimination
Ge Ou, Yan Wang
College of Computer Science and Technology
Jilin University
Changchun, China
*E-mail: wy6868@jlu.edu.cn
Wei Pang*, George Macleod Coghill
Department of Computing Science
University of Aberdeen
Aberdeen, UK
*E-mail: pang.wei@abdn.ac.uk
Abstract—In order to eliminate irrelevant features for
classification, we propose a novel feature selection algorithm
called Large Margin Distribution Machine Recursive Feature
Elimination (LDM-RFE). LDM-RFE uses the Large Margin Distribution
Machine (LDM), a recent support-vector-based classification
algorithm, to evaluate all the features of the samples, and then
generates a ranked feature list through Recursive
Feature Elimination (RFE). In the experiment section, we report
promising results obtained by LDM-RFE in comparison with
several common feature selection algorithms on five UCI
benchmark datasets.
Keywords-feature selection; large margin distribution machine;
recursive feature elimination; classification
In classification, feature selection [1] is a very important
technique used to avoid overfitting and reduce computational
complexity [2]. Many feature selection algorithms exist for
machine learning [3][4]; however, many of them are
general-purpose and not specific to classification.
Some feature selection algorithms, such as Principal
Component Analysis (PCA) [5], the t-test [6], and
Kullback-Leibler divergence [7], can be used with any machine
learning model. In contrast, Support Vector Machine
Recursive Feature Elimination (SVM-RFE) [8] is specifically
designed for classification tasks, and it performs better
than other commonly used feature selection
algorithms on many problems, especially high-dimensional
ones. Furthermore, some related feature selection
algorithms for classification have been proposed. Su and Hsiao
[9] proposed a Multiclass Mahalanobis-Taguchi system for
feature selection and simultaneous multiclassification. Wang
[10] studied a feature selection algorithm for big data
problems. Liu [11] proposed a framework for multiclass
sentiment classification. In addition, the study of classification
models has made new progress over the last few years. Zhou
and Zhang [12] proposed the Large Margin Distribution Machine
(LDM) algorithm, which has better classification performance
than the Support Vector Machine (SVM) [13] on the tested
problems. LDM is based on the novel theory of optimizing the
margin distribution, and it uses the dual coordinate descent
(DCD) [14] and averaged stochastic gradient
descent (ASGD) [15] strategies to solve the optimization
problem.
Considering the above, in this research we propose a novel
RFE algorithm for classification based on LDM, which we call
Large Margin Distribution Machine Recursive Feature
Elimination (LDM-RFE). At each iteration, the proposed LDM-RFE
ranks features by their contribution to building the LDM
model and progressively eliminates irrelevant features.
We compare LDM-RFE with
several commonly used feature selection algorithms, such as the
t-test, PCA, and SVM-RFE. The experimental results indicate
that LDM-RFE leads to better performance than
several other algorithms on five UCI [16] benchmark data sets.
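The recursive elimination loop described above can be illustrated with a minimal sketch. It is not the LDM-RFE implementation: the base learner is a plain linear classifier trained by subgradient descent on the regularised hinge loss, used here only as a stand-in for LDM, and the squared weight w_j^2 is assumed as the ranking criterion, as is common in SVM-RFE. The function names, hyperparameters, and toy data are hypothetical.

```python
def train_linear(X, y, features, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear classifier on the given feature subset.

    Stand-in for LDM (assumption): subgradient descent on the
    L2-regularised hinge loss, not the LDM objective.
    """
    w = {j: 0.0 for j in features}
    n = len(X)
    for _ in range(epochs):
        # Gradient of the regulariser (lam / 2) * ||w||^2.
        grad = {j: lam * w[j] for j in features}
        for xi, yi in zip(X, y):
            margin = yi * sum(w[j] * xi[j] for j in features)
            if margin < 1.0:  # sample violates the margin
                for j in features:
                    grad[j] -= yi * xi[j] / n
        for j in features:
            w[j] -= lr * grad[j]
    return w


def rfe_ranking(X, y):
    """Return feature indices ordered from most to least relevant."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        # Retrain on the surviving features at every iteration.
        w = train_linear(X, y, remaining)
        # Assumed criterion: drop the feature with the smallest w_j^2.
        worst = min(remaining, key=lambda j: w[j] ** 2)
        remaining.remove(worst)
        eliminated.append(worst)
    # The feature eliminated last is ranked first.
    return eliminated[::-1]


# Toy data (hypothetical): only feature 0 separates the two classes.
X = [
    [2.0, 0.5, 0.1], [1.5, -0.5, -0.1], [2.5, 0.3, 0.1], [1.8, -0.2, -0.1],
    [-2.0, 0.4, 0.1], [-1.5, -0.6, -0.1], [-2.5, 0.2, 0.1], [-1.8, -0.3, -0.1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = rfe_ranking(X, y)
print(ranking)
```

On this toy set, feature 0 is expected to head the ranking, since it is the only feature that separates the classes. Replacing train_linear with an LDM solver would give the setting studied in this paper, provided the solver exposes a weight vector.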
Let $S = \{(x_1, y_1), \ldots, (x_m, y_m)\}$ be a training set of
$m$ samples, where $x_i$ are the input samples and
$y_i \in \{-1, +1\}$ is the
label set. The objective function in classification problems is
$f(x) = w^\top \phi(x)$