This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY
Neighborhood Variational Bayesian Multivariate
Analysis for Distributed Process Monitoring
With Missing Data
Qingchao Jiang, Member, IEEE, Xuefeng Yan, and Biao Huang, Fellow, IEEE
Abstract—Conventional methods for distributed monitoring
commonly assume that complete process measurements are avail-
able. However, the problem of missing data is often encountered
in the monitoring of large-scale multiunit processes. This paper
proposes an approach based on a neighborhood variational
Bayesian principal component analysis (NVBPCA) and canonical
correlation analysis (CCA) for the efficient distributed monitoring
of multiunit processes in the presence of missing data. Missing
observations for a local unit are reconstructed through NVBPCA
by considering information from both local and neighboring
units. A CCA-based local monitor, which identifies the status of
the local unit and the type of a detected fault using information
from both the local and neighboring units, is then developed.
The NVBPCA–CCA approach improves monitoring performance because
both its missing data handling and its local monitor construction use
information from the local and neighboring units. The
efficiency of the proposed monitoring method is demonstrated
through its application to a numerical example and an industrial
tail gas treatment process.
Index Terms—Canonical correlation analysis (CCA), distrib-
uted process monitoring, fault detection, missing data, variational
Bayesian principal component analysis (VBPCA).
I. INTRODUCTION
Fault detection plays an important role in maintaining
plant safety [1]–[4]. Data-driven process monitoring
has become popular with the progression of data collection
and transmission techniques [5]–[9]. Currently, plant-wide
processes are generally of large scale and consist of multiple
operation units. Monitoring of such processes is necessary but
challenging [10], [11]. Multiblock and distributed monitoring
methods are of significant interest in the monitoring of large-
scale multiunit processes [12].
Manuscript received June 6, 2018; revised August 5, 2018; accepted
September 9, 2018. Manuscript received in final form September 11, 2018.
This work was supported in part by the National Natural Science Foundation
of China under Grant 61603138, in part by the Shanghai Pujiang Program
under Grant 17PJD009, in part by the Fundamental Research Funds for
the Central Universities under Grant 222201717006 and Grant 222201714027,
in part by the Program of Introducing Talents of Discipline to Universities
(the 111 Project) under Grant B17017, and in part by the Natural Sciences and
Engineering Research Council of Canada. Recommended by Associate Editor
P. Mhaskar. (Corresponding authors: Xuefeng Yan; Biao Huang.)
Q. Jiang and X. Yan are with the Key Laboratory of Advanced Control
and Optimization for Chemical Processes, Ministry of Education, East China
University of Science and Technology, Shanghai 200237, China (e-mail:
qchjiang@ecust.edu.cn; xfyan@ecust.edu.cn).
B. Huang is with the Department of Chemical and Materials Engineering,
University of Alberta, Edmonton, AB T6G 2V4, Canada (e-mail:
bhuang@ualberta.ca).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCST.2018.2870570
A multiblock or distributed monitoring method decom-
poses a process into subblocks to reduce complexity and
enable the exploration of local process behaviors [10], [13].
A method based on multiblock partial least squares (PLS)
and principal component analysis (PCA) is developed in [14].
In [15], several multiblock PCA and multiblock PLS methods
are compared. In [16], multiblock methods are further
analyzed and improved. Data-driven distributed
monitoring methods have been developed recently. For
example, in [13], distributed PCA monitors are developed
on the basis of loading vectors. Mutual-information-based
distributed monitors are developed in [17]. A performance-
driven distributed monitoring method is proposed in [18],
and a distributed monitoring framework is presented in [12].
More recently, distributed monitoring methods with big data
have been developed [19], [20]. A parallel PCA-kernel PCA
method is developed in [21]. These multiblock or distributed
monitoring methods lay the foundation for large-scale process
monitoring. Nevertheless, the following deficiencies remain to be
addressed. First, multiblock or distributed monitoring
methods do not elaborate on the monitoring of local units.
Second, they do not identify the type of a detected fault.
Third, the missing data problem encountered in the monitoring of
large-scale multiunit processes has not been addressed.
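To make the decomposition idea discussed above concrete, the following is a minimal sketch of block-wise monitoring, written under stated assumptions rather than taken from any of the cited methods. It fits one ordinary PCA model per unit and raises a local alarm when the Hotelling T^2 statistic of that unit exceeds an empirical control limit. The function names, the block partition, the number of retained components, the simulated data, and the quantile-based limit are all illustrative choices; the sketch assumes complete measurements and does not implement NVBPCA or CCA.

```python
import numpy as np

def fit_block_pca(X, n_components):
    """Fit a PCA model on one block of normal-operation data (rows are samples)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # SVD of the scaled data: rows of Vt are loadings, s**2/(n-1) are score variances.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    lam = s[:n_components] ** 2 / (len(X) - 1)
    return {"mean": mean, "std": std, "P": P, "lam": lam}

def t2_statistic(model, x):
    """Hotelling T^2 of one sample with respect to a block PCA model."""
    z = (x - model["mean"]) / model["std"]
    t = model["P"].T @ z
    return float(np.sum(t ** 2 / model["lam"]))

def fit_distributed_monitors(X, blocks, n_components=2, quantile=0.99):
    """Fit one PCA monitor per block (hypothetical partition given by `blocks`)."""
    monitors = {}
    for name, cols in blocks.items():
        model = fit_block_pca(X[:, cols], n_components)
        # Assumption: control limit taken as an empirical quantile of training T^2.
        train_t2 = [t2_statistic(model, row) for row in X[:, cols]]
        model["limit"] = float(np.quantile(train_t2, quantile))
        model["cols"] = cols
        monitors[name] = model
    return monitors

def monitor_sample(monitors, x):
    """Return, per block, whether T^2 exceeds that block's control limit."""
    return {name: t2_statistic(m, x[m["cols"]]) > m["limit"]
            for name, m in monitors.items()}

# Simulated two-unit process: each unit has 3 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
A1 = rng.normal(size=(2, 3))
A2 = rng.normal(size=(2, 3))
X_train = np.hstack([
    rng.normal(size=(500, 2)) @ A1 + 0.1 * rng.normal(size=(500, 3)),
    rng.normal(size=(500, 2)) @ A2 + 0.1 * rng.normal(size=(500, 3)),
])
blocks = {"unit_1": [0, 1, 2], "unit_2": [3, 4, 5]}
monitors = fit_distributed_monitors(X_train, blocks)

# A sample with a large deviation confined to the variables of unit_2.
x_test = X_train.mean(axis=0)
x_test[3:] += 10.0 * A2[0]
print(monitor_sample(monitors, x_test))
```

Each local monitor in this sketch sees only its own block, so it cannot tell whether a fault is related to neighboring units and it requires every measurement to be present. These are the limitations listed in the preceding paragraph.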
Canonical correlation analysis (CCA), a basic multivariate
analytical method, has promising applications in monitoring
large-scale multiunit processes. A CCA-based residual gener-
ation approach has been recently proposed in [22]. A CCA
approach integrated with a randomized algorithm for non-
Gaussian fault detection is presented in [23]. In [24], a multiset
CCA is used to extract the joint feature and individual features
of multiunit processes. This approach is further extended to
parallel running batch processes in [25]. In [26], a CCA-
based distributed monitoring approach for large-scale multiunit
processes is proposed. In [27], a time-slice CCA is proposed
for key unit monitoring of batch processes, in which the type
of a detected fault, i.e., relevant or irrelevant to other units,
is identified. However, the following deficiencies exist in the
CCA-based methods. First, the common missing data problem
is not discussed. The CCA-based monitoring focuses much
1063-6536 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.