Dissimilarity based ensemble of extreme learning machine for gene expression data classification ☆

Hui-juan Lu a,b,*, Chun-lin An a, En-hui Zheng c, Yi Lu d

a College of Information Engineering, China Jiliang University, Hangzhou 310018, China
b School of Information and Electric Engineering, China University of Mining and Technology, Xuzhou 221008, China
c College of Mechanical and Electric Engineering, China Jiliang University, Hangzhou 310018, China
d Department of Computer Science, Prairie View A&M University, Prairie View 77446, USA
Article info

Article history:
Received 18 September 2012
Received in revised form 4 February 2013
Accepted 11 February 2013
Available online 8 November 2013

Keywords:
Extreme learning machine
Dissimilarity ensemble
Double-fault measure
Majority voting
Gene expression data
Abstract

Extreme learning machine (ELM) has salient features such as fast learning speed and excellent generalization performance. However, a single extreme learning machine is unstable in data classification. To overcome this drawback, more and more researchers have turned to ensembles of ELMs. This paper proposes a method that integrates the voting-based extreme learning machine (V-ELM) with dissimilarity measures, termed D-ELM. First, based on different dissimilarity measures, we remove a number of ELMs from the ensemble pool. Then, the remaining ELMs are combined into an ensemble classifier by majority voting. Finally, we use the disagreement measure and the double-fault measure to validate the D-ELM. Theoretical analysis and experimental results on gene expression data demonstrate that (1) the D-ELM can achieve better classification accuracy with fewer ELMs, and (2) the double-fault-measure-based D-ELM (DF-D-ELM) performs better than the disagreement-measure-based D-ELM (D-D-ELM).
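For reference, the two diversity measures named in the abstract have standard pairwise definitions (the notation below is ours; the paper's own formulas appear later in the text). For base classifiers $f_i$ and $f_j$, let $N^{ab}$ be the number of validation samples on which $f_i$ is correct ($a = 1$) or wrong ($a = 0$) and $f_j$ is correct ($b = 1$) or wrong ($b = 0$):

\[
\mathrm{dis}_{i,j} = \frac{N^{01} + N^{10}}{N^{00} + N^{01} + N^{10} + N^{11}}, \qquad
\mathrm{DF}_{i,j} = \frac{N^{00}}{N^{00} + N^{01} + N^{10} + N^{11}}.
\]

A higher disagreement value and a lower double-fault value both indicate a more diverse pair of classifiers. A minimal sketch of the prune-then-vote pipeline the abstract describes is given below; the ranking rule (dropping the classifiers with the highest average pairwise double-fault) is an illustrative assumption, not the paper's exact selection criterion, and class labels are assumed to be integers 0..K-1.

import numpy as np

def double_fault(pred_i, pred_j, y):
    # N^{00} / N: fraction of samples misclassified by both classifiers
    return np.mean((pred_i != y) & (pred_j != y))

def prune_by_double_fault(preds, y, k_remove):
    # preds: (n_classifiers, n_samples) predicted labels on a validation set
    # illustrative rule: drop the k_remove classifiers with the highest
    # average pairwise double-fault against the rest of the pool
    n = len(preds)
    avg_df = np.array([np.mean([double_fault(preds[i], preds[j], y)
                                for j in range(n) if j != i])
                       for i in range(n)])
    return np.argsort(avg_df)[: n - k_remove]   # indices of ELMs to keep

def majority_vote(preds):
    # per-sample majority label over the (n_classifiers, n_samples) array
    preds = np.asarray(preds)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, preds)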
1. Introduction

The Human Genome Project (HGP) was officially launched in 1990. In the short span of 20 years, gene technology has developed rapidly. Golub et al. [1] were the first to use gene chips to study human acute leukemia, and they found two subtypes of acute lymphoblastic leukemia: T-cell ALL and B-cell ALL. Early classification methods applied to gene expression data include the support vector machine (SVM) [2], artificial neural networks (ANNs) [3], and the probabilistic neural network (PNN) [4]. Jin et al. [5] used the partial least squares method to establish a classification model. Zhang et al. [6] applied non-negative matrix factorization (NMF) to gene expression data classification. Yang et al. [7] used a binary decision tree to classify tumor gene expression data.
The extreme learning machine (ELM) [8] was proposed as an efficient learning algorithm for single-hidden-layer feedforward neural networks (SLFNs). It achieves high learning speed by randomly generating the weights and biases of the hidden nodes and solving only for the output weights, rather than iteratively adjusting all network parameters as gradient-based methods do.
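As a concrete illustration, here is a minimal NumPy sketch of the standard ELM training procedure (ours, not code from the paper): the hidden-layer parameters are drawn at random once, and the output weights are then computed as the least-squares solution via the Moore-Penrose pseudoinverse.

import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    # X: (n_samples, n_features) inputs; T: (n_samples, n_outputs) targets
    # (one-hot class indicators for classification)
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never updated
    b = rng.standard_normal(n_hidden)                # random hidden biases, never updated
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # output weights: least-squares fit
    return W, b, beta

def elm_predict(X, W, b, beta):
    # predicted class = index of the largest output (for one-hot targets)
    return (np.tanh(X @ W + b) @ beta).argmax(axis=1)

Because W and b are fixed after random initialization, training reduces to a single linear solve for beta, which is the source of ELM's speed advantage over iterative gradient-based training.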
However, the stability of a single ELM leaves room for improvement. To achieve better generalization performance, Lan et al. [9] proposed an ensemble of online sequential extreme learning machines (EOS-ELM), which is more stable and accurate than the original OS-ELM. Motivated by the ensemble idea, in 2009 van Heeswijk et al. [10] proposed an adaptive ensemble model of ELMs with low computational cost. In 2010, Tian and Meng proposed a bagging ensemble scheme to combine ELMs [11], as well as another ELM ensemble method based on the modified AdaBoost.RT algorithm [12]. In the same year, an ensemble-based ELM (EN-ELM) algorithm was proposed by Liu and Wang [13], which uses a cross-validation scheme to create an ensemble of ELM classifiers for decision making. Wang and Li [14] proposed a dynamic AdaBoost ensemble of ELMs, which has been successfully applied to function approximation and classification problems. Zhai et al. [15] proposed a dynamic ensemble of sample-entropy-based extreme learning machines, which alleviates the instability and overfitting problems to some extent and increases prediction accuracy. In 2011, van Heeswijk et al. [16] proposed a GPU-accelerated, parallelized ELM ensemble method for large-scale regression. In 2012, Wang and Alhamdoosh [17] proposed an algorithm that employs model diversity as a fitness function to direct the selection of base learners and produces an optimal solution with ensemble size control. It improved the generalization
☆ This work was supported by the National Natural Science Foundation of China (Nos. 61272315, 60842009, and 60905034), the Zhejiang Provincial Natural Science Foundation (Nos. Y1110342, Y1080950), and the Pao Yu-Kong and Pao Zhao-Long Scholarship for Chinese Students Studying Abroad.
* Corresponding author at: College of Information Engineering, China Jiliang University, Hangzhou 310018, China. Tel.: +86 57186914580; fax: +86 57186914573.
E-mail addresses: hjlu@cjlu.edu.cn, huijuanlu29@gmail.com (H.-j. Lu).