ORIGINAL ARTICLE
Research of neural network algorithm based on factor analysis
and cluster analysis
Shifei Ding • Weikuan Jia • Chunyang Su • Liwen Zhang • Lili Liu
Received: 16 September 2009 / Accepted: 23 June 2010 / Published online: 7 July 2010
© Springer-Verlag London Limited 2010
Abstract For large samples with high feature dimensionality,
this paper proposes a back-propagation (BP) neural network
algorithm based on factor analysis (FA) and cluster analysis
(CA), which combines the principles of FA and CA with the
architecture of the BP neural network. The new algorithm first
reduces the feature dimensionality of the initial data through
FA to simplify the network architecture, then divides the
samples into different sub-categories through CA and trains the
corresponding networks so as to improve the adaptability of the
network. In application, a new sample is first classified, and
the corresponding network is then used for prediction. An
experiment shows that the new algorithm significantly improves
prediction precision. To test and verify the validity of the
new algorithm, we compare it with BP algorithms based on FA
and CA.
Keywords Artificial neural network (ANN) · Factor analysis (FA) ·
Cluster analysis (CA) · FA-CA-BP network
1 Introduction
Artificial neural network (ANN) research is an interdisciplinary
field that draws on Brain Science, Neuroscience, Cognitive
Science, Psychology, Computer Science, and Mathematics [1]. It
has many important applications in the natural sciences, such as
Earth Science [2], Environmental Science [3], and Physical
Science [4]. An artificial neural network simulates the structure
and some of the working mechanisms of the neural network of the
human brain in order to establish a computing model. Artificial
neural networks are characterized by self-adaptation,
self-organization, and real-time learning, and they are powerful
in dealing with non-linear problems and large-scale computation.
Neural networks have been studied for more than 60 years. During
this time, hundreds of network algorithm models have been
proposed [5], and the back-propagation (BP) neural network is one
of the most mature and most widespread [6]. Artificial neural
networks make it convenient to solve many problems, but they are
not perfect, owing to the features of the input samples and the
properties of the network structure. For example, a large number
of original samples provides useful information but also makes
the data more difficult for the neural network to handle, because
the features of the samples contain correlated or even repeated
information. If all of the data are taken as the network input,
this is detrimental to the design of the network and occupies a
lot of storage space and computing time. Too many feature inputs
and repeated training samples make training time-consuming and
hinder the convergence of the network, which ultimately affects
its recognition precision. It is therefore necessary to
pre-process the original data, analyzing and extracting useful
variable features from the large amount of data and excluding the
influence of correlated or duplicate factors. It is also
important to reduce the feature dimensionality as far as possible
without affecting the solution of the problem, and then to
classify similar samples together in order to simplify the
network structure.
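The pre-processing idea outlined above can be illustrated with a short Python sketch. It is only an illustration under simplifying assumptions and not the algorithm developed in this paper: a Pearson-correlation filter stands in for factor analysis, a plain k-means stands in for cluster analysis, and the function names (`pearson`, `drop_redundant`, `nearest`, `kmeans`), the threshold of 0.95, and the toy data are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length feature columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as uncorrelated.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def drop_redundant(rows, threshold=0.95):
    """Indices of features to keep (assumed stand-in for FA).

    A feature is dropped when its absolute correlation with an
    already-kept feature reaches the threshold.
    """
    cols = list(zip(*rows))
    kept = []
    for j, col in enumerate(cols):
        if all(abs(pearson(col, cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    def dist2(c):
        return sum((p - q) ** 2 for p, q in zip(point, c))
    return min(range(len(centroids)), key=lambda i: dist2(centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means (assumed stand-in for CA), seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; an empty group keeps its old centroid.
        centroids = [
            [sum(c) / len(g) for c in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


# Toy data (hypothetical): feature 1 duplicates feature 0, scaled by 2.
rows = [
    [1.0, 2.0, 5.0], [2.0, 4.0, 5.0], [1.0, 2.0, 6.0], [2.0, 4.0, 6.0],
    [9.0, 18.0, 5.0], [10.0, 20.0, 6.0], [9.0, 18.0, 6.0], [10.0, 20.0, 5.0],
]
kept = drop_redundant(rows)
reduced = [[r[j] for j in kept] for r in rows]
centroids = kmeans(reduced, k=2)

# A new sample is reduced the same way, then routed to its nearest group.
new_sample = [1.2, 2.4, 5.1]
group = nearest([new_sample[j] for j in kept], centroids)
print(kept, centroids, group)
```

In the scheme described in the abstract, each group of samples would then be used to train its own BP network, and a new sample would be passed to the network of the group it is routed to; that training step is omitted from this sketch.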
S. Ding (✉) · W. Jia · C. Su · L. Zhang · L. Liu
School of Computer Science and Technology,
China University of Mining and Technology,
Xuzhou 221008, China
e-mail: dingsf@cumt.edu.cn
S. Ding
Key Laboratory of Intelligent Information Processing,
Institute of Computing Technology, Chinese Academy
of Sciences, Beijing 100080, China
Neural Comput & Applic (2011) 20:297–302
DOI 10.1007/s00521-010-0416-2