Optimal Linear Combination of Neural Network Classifiers
Based on the Minimum Classification Error Criterion
Naonori Ueda
NTT Communication Science Laboratories, Kyoto, Japan 619-0237
SUMMARY
Focusing on classification problems, this paper presents a new method for linearly combining multiple neural network classifiers based on statistical pattern recognition theory. In our approach, several neural networks are first selected, each of which works best for one of the classes in terms of minimizing classification errors. They are then linearly combined to form an ideal classifier that takes advantage of the strengths of the individual classifiers, avoids their weaknesses, and improves on every individual classifier. The minimum classification error (MCE) criterion is used to estimate the optimal linear weights. Because the classification decision rule is incorporated into the cost function in this formulation, combination weights better suited to the classification objective can be obtained. Experimental results on artificial and real data sets show that the proposed method constructs a combined classifier that outperforms the best single classifier in terms of overall classification error on test data. © 2000 Scripta Technica, Syst Comp Jpn, 31(9): 39–48, 2000
Key words: Pattern classification; ensemble learning; linear combination; minimum classification error discriminant; neural network.
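As a rough illustration of the idea stated above, that the classification decision rule is built into the cost function used to estimate the combination weights, the following NumPy sketch minimizes a sigmoid-smoothed misclassification measure over the combined discriminants by gradient descent. This is only a minimal sketch under assumptions of my own (per-network weights shared across classes, a soft-max competitor term, and all function and parameter names such as `mce_combination_weights`, `gamma`, and `lr`); it is not the formulation developed in this paper.

```python
import numpy as np

def mce_combination_weights(outputs, labels, n_iter=500, lr=0.5, gamma=4.0):
    """Estimate linear combination weights by minimizing a sigmoid-smoothed
    classification-error (MCE-style) cost with gradient descent.

    outputs : array, shape (N, M, K); outputs[n, m, k] is classifier m's
              score for class k on sample n.
    labels  : int array, shape (N,); true class indices in {0, ..., K-1}.

    NOTE: illustrative sketch only; the shared per-network weights and the
    soft-max competitor term are assumptions, not this paper's formulation.
    """
    N, M, K = outputs.shape
    w = np.full(M, 1.0 / M)                       # start from simple averaging
    idx = np.arange(N)
    for _ in range(n_iter):
        g = np.einsum('nmk,m->nk', outputs, w)    # combined discriminants (N, K)
        g_true = g[idx, labels]                   # discriminant of the true class
        g_comp = g.copy()
        g_comp[idx, labels] = -np.inf             # exclude the true class
        m_ = np.max(g_comp, axis=1)               # stabilized log-sum-exp of rivals
        lse = m_ + np.log(np.sum(np.exp(g_comp - m_[:, None]), axis=1))
        d = lse - g_true                          # misclassification measure
        l = 1.0 / (1.0 + np.exp(-gamma * d))      # smooth 0-1 loss (sigmoid)
        # gradient of the mean loss with respect to the combination weights
        p = np.exp(g_comp - lse[:, None])         # soft-max over rival classes (N, K)
        y_true = outputs[idx, :, labels]          # true-class outputs (N, M)
        y_comp = np.einsum('nk,nmk->nm', p, outputs)
        grad = np.mean((gamma * l * (1.0 - l))[:, None] * (y_comp - y_true), axis=0)
        w -= lr * grad
    return w
```

With a large smoothing constant the cost approaches the actual number of training errors, which is the sense in which the classification decision rule is incorporated into the cost function.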
1. Introduction
Combined estimators have been shown, both experimentally and theoretically, to achieve better generalization than a single estimator [1–10]. The output of a combined estimator for a given input is usually defined as a linear combination of the outputs of multiple estimators, where it is assumed that each estimator is constructed separately from the same training data. For classification problems, it has been shown that combining multiple unstable classifiers such as decision trees or neural networks reduces classification errors on test data; as a result, combining classifiers is regarded as a variance-reducing device [3–5].
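To make this combination scheme concrete, the following sketch forms a weighted linear combination of the class outputs of M separately trained classifiers and decides by the largest combined output. The array layout, the function name, and the equal-weight default are illustrative assumptions, not notation taken from this paper.

```python
import numpy as np

def combine_and_classify(outputs, weights=None):
    """Linear combination of M classifiers' outputs (illustrative sketch).

    outputs : array, shape (M, N, K)
        outputs[m, n, k] is classifier m's output for class k on sample n.
    weights : array, shape (M,), or None for simple (equal-weight) averaging.
    """
    M = outputs.shape[0]
    w = np.full(M, 1.0 / M) if weights is None else np.asarray(weights, float)
    combined = np.tensordot(w, outputs, axes=1)   # weighted sum of outputs -> (N, K)
    return combined.argmax(axis=1)                # decide by the largest combined output
```

Simple averaging corresponds to leaving the weights at their default; the method developed in this paper instead chooses the weights so as to minimize classification errors directly.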
Ideally, however, such a combination of classifiers should take advantage of the strengths of the individual classifiers, avoid their weaknesses, and improve on all of the individual classifiers. It is well known that the best single neural network, in the sense of the minimum number of classification errors, can be obtained by using regularization techniques [15]. However, since the complexity of the class boundaries is not necessarily uniform over all classes in a feature space, it may be better to arrange a situation in which one neural network works best for one class while another neural network works best for another class. In such a case, we expect the combination of these classifiers to further reduce the classification errors. In this paper, I propose a method for designing such an ideal combination of neural network classifiers.
The method proposed in this paper is motivated by the attempt to achieve such an ideal combination and thereby improve classification performance. That is, our goal is to develop a linear combination method that constructs not merely a stable classifier, but the best classifier, one that outperforms the individual classifiers in terms of minimizing classification errors on test data. In our approach, by changing a regularization parameter, several neural networks are first
Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J82-D-II, No. 3, March 1999, pp. 522–530.