bearing and electric cooling fan. Fink et al. [26] proposed
a multilayer feed-forward NN with multivalued neurons
approach to deal with the reliability and degradation time-
series prediction problem, and conducted a case study on
predicting the degradation of a railway turnout system.
Many existing data-driven prognostic approaches rely on a
single model, which can hardly maintain good generalization
performance across various prognostic scenarios, especially
when the model has been carefully configured for one particular
scenario. Moreover, most approaches employ handcrafted features,
which can hardly adapt to different prognostic applications.
In this paper, an ensemble of DBNs is used to address
these two issues. We propose a multiobjective evolutionary
ensemble learning framework that incorporates the traditional
DBN training technique to evolve multiple DBNs with
accuracy and diversity as two conflicting objectives. All of the
finally evolved DBNs are combined into an ensemble model
for RUL prediction, whose combination weights are optimized
via DE using a task-oriented objective. Our proposed method
requires two prerequisites:
1) the maximally allowed system degradation level (i.e., the
threshold of system failure);
2) the relevant historical data (sensor data, environmental
and/or operational parameters, and so on) of the system.
Based on historical data, prognostic models can be established
to estimate the RUL.
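To make the overall idea concrete, the following minimal Python sketch combines the outputs of several already trained base models into an RUL estimate and tunes the combination weights with DE; the data, the number of members, and the RMSE objective are illustrative placeholders rather than the exact setup of this paper.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical validation data: predictions of 5 base models on 100 samples
# and the corresponding ground-truth RUL values (both placeholders).
member_preds = np.random.rand(5, 100) * 120.0
rul_true = np.random.rand(100) * 120.0

def ensemble_rmse(w):
    """Task-oriented objective: RMSE of the weighted ensemble prediction."""
    w = np.abs(w) / (np.abs(w).sum() + 1e-12)        # normalize the weights
    pred = w @ member_preds                          # weighted combination
    return np.sqrt(np.mean((pred - rul_true) ** 2))

result = differential_evolution(ensemble_rmse, bounds=[(0.0, 1.0)] * 5, seed=0)
weights = np.abs(result.x) / np.abs(result.x).sum()
print("optimized combination weights:", weights)
```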
B. Ensemble Learning
Many works [36]–[39] have demonstrated that a single
method can seldom perform consistently well across a
variety of applications. An ensemble of multiple models
(i.e., base learners) with an appropriate combination of
individual models’ outputs can take advantage of each
ensemble member so as to improve generalization. Although
an ensemble may perform better than any of its members,
it is not easy to construct a good ensemble. The most crucial
task is to obtain base learners that individually outperform
random guessing and, at the same time, are weakly correlated
with each other in terms of training errors [40]. Previous
works [41], [42] indicated that an ideal ensemble should
consist of base learners that produce small errors while
remaining diverse. Increasing the diversity among base learners
is therefore one way to pursue better generalization. Finding an
appropriate combination of the diversified ensemble members
is another key task in ensemble learning.
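As a concrete (though simplified) illustration of these two quantities, the sketch below computes a per-member accuracy objective (RMSE) and a per-member diversity score (ambiguity relative to the ensemble mean); the actual measures adopted in this paper may differ.

```python
import numpy as np

def accuracy_and_diversity(member_preds, y_true):
    """member_preds: array of shape (n_members, n_samples).
    Returns a per-member accuracy objective (RMSE, to be minimized) and a
    per-member diversity score (ambiguity w.r.t. the ensemble mean, to be
    maximized)."""
    ens_mean = member_preds.mean(axis=0)
    rmse = np.sqrt(np.mean((member_preds - y_true) ** 2, axis=1))
    ambiguity = np.mean((member_preds - ens_mean) ** 2, axis=1)
    return rmse, ambiguity

# In a multiobjective formulation, each member could be scored on
# (rmse, -ambiguity) so that both quantities are minimized together.
```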
Ensemble methods have also been applied to improve the
ability to handle noisy sensor data and the complex relations
underlying the data. Opitz and Maclin [43] showed that
accuracy and diversity of the base learners are sufficient
conditions for an ensemble to outperform any of its members.
However, these two objectives are often conflicting [44];
as a result, MOEAs [45]–[50] can provide a solution.
Existing works on MOEA-based ensembles include negative
correlation learning (NCL) [51], memetic Pareto artificial
NN [52], [53], and diverse and accurate ensemble learning
algorithm [44]. In fact, many existing ensemble prognostic
approaches rely on only a single objective to train their
ensemble members, e.g., mean square error [30], [31], [54],
root mean square error (RMSE) [15], and the PHM08 score [4],
[8], [15], [31], which may lead to undesirable generalization
performance. In this paper, we use accuracy and diversity as
two conflicting objectives to train the base learners of the
ensemble.

Fig. 1. Architecture of a DBN composed of three stacked RBMs, where v represents the input (visible) layer, h_i (i = 1, ..., 3) denote the three hidden layers, and w_i (i = 1, ..., 3) are the connection weight matrices of the three RBMs.
C. Deep Belief Networks
In recent years, DBNs have become one of the most active
research topics [55] in machine learning and have been
successfully applied in a variety of fields, such as computer
vision [56], acoustic classification [57], phoneme recognition [58],
and natural language processing [59]. These success stories reveal
that DBNs are capable of learning suitable feature representations
from data. This paper aims to explore the application of DBNs
to prognostics, where they have seldom been studied.
A majority of traditional data-driven prognostic approaches
need to first extract (and select) a set of handcrafted features
and then utilize these features (in the form of a feature vector)
as the input of the learning model. However, such approaches
are often unable to capture the most relevant and useful feature
representations. For complex engineering systems, features
may have hierarchical relations, which a single-hidden-layer
NN may not be able to capture. Furthermore, traditional NNs
are very inefficient to train when they contain a large number
of hidden neurons, and adding many hidden layers does not
necessarily lead to better performance because the correction
signal is diluted when using BP [22], [23]. Therefore,
traditional NN-based approaches usually constrain the network
to one or two hidden layers.
The DBN is capable of extracting a hierarchy of features,
where features at higher network levels are usually more
relevant to the ultimate task. It is composed of stacked
restricted Boltzmann machines (RBMs) which can be trained
via contrastive divergence [21] in an unsupervised manner.
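A rough sketch of this greedy layerwise, unsupervised construction is given below, using scikit-learn's BernoulliRBM as a stand-in for the RBMs described next; the input data and layer sizes are purely illustrative and not those used in this paper.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

X = np.random.rand(500, 24)        # e.g., sensor readings scaled to [0, 1]

layer_sizes = [32, 16, 8]          # hypothetical widths of h1, h2, h3
rbms, features = [], X
for n_hidden in layer_sizes:
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05,
                       n_iter=20, random_state=0)
    features = rbm.fit_transform(features)   # train one RBM unsupervised and
    rbms.append(rbm)                          # feed its activations upward

# `features` now holds the top-level representation on which a supervised
# output layer (e.g., an RUL regressor) would subsequently be fine-tuned.
```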
1) Restricted Boltzmann Machine: The RBM [60] is a two-layer
undirected graphical model, which is typically trained in an
unsupervised manner. It is composed of one visible layer and
one hidden layer, with no connections allowed between nodes
within the same layer.
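For completeness, a minimal NumPy sketch of such an RBM and one contrastive-divergence (CD-1) update is shown below; it is a didactic illustration rather than the implementation used in this paper, and all dimensions and the learning rate are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyRBM:
    """Toy Bernoulli-Bernoulli RBM trained with one step of contrastive
    divergence (CD-1); sizes and learning rate are illustrative only."""

    def __init__(self, n_visible, n_hidden, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible-unit biases
        self.b_h = np.zeros(n_hidden)    # hidden-unit biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data batch v0.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step to obtain a reconstruction.
        pv1 = self.visible_probs(h0)
        ph1 = self.hidden_probs(pv1)
        # Approximate log-likelihood gradient (CD-1 update).
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
        self.b_v += self.lr * (v0 - pv1).mean(axis=0)
        self.b_h += self.lr * (ph0 - ph1).mean(axis=0)
        return np.mean((v0 - pv1) ** 2)   # reconstruction error, for monitoring
```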
2) Deep Belief Network: The DBN, proposed by
Hinton et al. [21], is composed of several stacked
RBMs [60], as shown in Fig. 1. It typically involves layerwise