Batch Process Monitoring with GTucker2 Model
Lijia Luo,*,† Shiyi Bao,† Zengliang Gao,† and Jingqi Yuan‡
†College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou, China
‡Department of Automation, Shanghai Jiao Tong University, Shanghai, China
ABSTRACT: In this paper, the GTucker2 model is proposed for monitoring both even-length and uneven-length batch
processes. The GTucker2 model has two prominent advantages. The first one is that it performs tensor decomposition on the
three-way data array and thus avoids potential problems of information loss and “curse of dimensionality” induced by data
unfolding. The second one is that it solves the uneven-length problem in a “natural” way without using batch trajectory
synchronization, which avoids distorting data and fault patterns and guarantees higher modeling and monitoring precision. An
online batch process monitoring method is then developed by integrating GTucker2 with the moving data window technique.
Three monitoring statistics, named the Q, R², and T² statistics, are constructed for fault detection and diagnosis. The effectiveness and
advantages of the GTucker2-based monitoring method are illustrated by two case studies in a benchmark fed-batch penicillin
fermentation process.
1. INTRODUCTION
Batch processes have been widely used to manufacture low-
volume and high-value-added products in many industrial
sectors, including pharmaceutical, biochemical, food, and
semiconductor industries.¹⁻³ The batch process has some
advantages over the continuous process, such as high flexibility
and quick response to changing market needs. With
increasing global competition, batch processes are playing an
increasingly important role in industrial production. Process
safety and consistent product quality are two important
issues for batch processes that have attracted much attention, which
makes process monitoring a necessary part of process
operation. The aim of process monitoring is to achieve an early
warning of abnormal operating conditions that may result in
production interruption, low-quality products, and even
equipment damage. Such an early warning provides a chance
to take corrective actions to recover normal production or at
least to avoid the waste of more raw materials. However,
achieving reliable and robust monitoring of batch processes is
challenging because of intractable process features such as
multiphase operation, strong coupling, nonlinearity, and time variance.⁴
As contemporary batch processes become more and more
flexible and complicated, monitoring methods based on
mathematical models or process knowledge are not applicable
in many cases. On the other hand, with the fast development of
process automation techniques, a large amount of process data
is recorded. These data contain much useful information about
process operating status and product quality for modeling,
monitoring, and control. This motivates the development of
data-driven statistical process control concepts and methods.
Since Nomikos and MacGregor¹ first proposed the MPCA-based
monitoring method, multivariate statistical process
control (MSPC) has been an active research topic.
Unlike data from continuous processes, historical data collected
from a batch process usually take the form of a three-way array.
This three-way data array is mainly analyzed with bilinear and
trilinear methods. To date, bilinear methods have been adopted
more often than trilinear methods. Bilinear methods need to
unfold the three-way data array into a matrix before building a
monitoring model. Typical bilinear methods include multiway
principal component analysis (MPCA),¹ multiway partial least-squares
(MPLS),² multiway independent component analysis
(MICA),⁵ multiway locality preserving projections (MLPP),⁶
and so on. Trilinear methods model the three-way data array
directly with tensor decomposition, such as trilinear decomposition
(TLD),⁷ parallel factor analysis (PARAFAC),⁸,⁹ and
Tucker3 decomposition.⁹
Since batch data sets are inherently
three-dimensional, data unfolding may destroy the intrinsic data
structure, potentially leading to information loss. Moreover,
data unfolding tends to induce the “curse of dimensionality”
for larger data sets, meaning that many model
parameters must be identified,³,⁹ which hinders the establishment
of a monitoring model. These drawbacks reduce the monitoring
performance of bilinear methods. Compared with bilinear
methods, trilinear methods are expected to be more reliable and
stable,⁹ because they avoid data unfolding and at the same
time compress the data in two or three directions, which is beneficial
for building a more concise and precise monitoring model.
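As a rough illustration of the unfolding and parameter-count argument above, the following Python sketch unfolds a three-way batch array into a matrix and compares the number of parameters of a PCA model on the unfolded matrix with that of a Tucker3 model. The array sizes, the number of components, and the function names are hypothetical choices made for illustration only, and Tucker3 is used here simply as a representative trilinear model; the sketch is not the GTucker2 model proposed in this paper.

```python
import numpy as np

def unfold_batchwise(X):
    """Unfold an (I, J, K) array into an (I, J*K) matrix, one row per batch.

    Each row holds the J variables at time 0, then at time 1, and so on.
    """
    I, J, K = X.shape
    return X.transpose(0, 2, 1).reshape(I, K * J)

def mpca_param_count(I, J, K, R):
    """Scores (I x R) plus loadings (J*K x R) of a PCA on the unfolded matrix."""
    return I * R + J * K * R

def tucker3_param_count(I, J, K, P, Q, R):
    """Three factor matrices plus the (P, Q, R) core array of a Tucker3 model."""
    return I * P + J * Q + K * R + P * Q * R

# Hypothetical sizes: 50 batches, 10 variables, 400 sampling times.
I, J, K = 50, 10, 400
rng = np.random.default_rng(0)
X = rng.normal(size=(I, J, K))

X_unf = unfold_batchwise(X)
print(X_unf.shape)                            # (50, 4000)
print(mpca_param_count(I, J, K, 3))           # 12150
print(tucker3_param_count(I, J, K, 3, 3, 3))  # 1407
```

With these assumed sizes, the unfolded matrix has 4000 columns for only 50 rows, and the loading matrix of the unfolded model dominates the parameter count, whereas the trilinear model compresses each mode separately.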
Most batch process monitoring methods are based on the
assumption that all batches have the same duration. Clearly,
this assumption is too idealistic, since the duration of a practical
batch process is difficult to fix owing to unavoidable disturbances
and changes in operating conditions. In such situations,
batch trajectory synchronization is often adopted.¹⁰ The
simplest trajectory synchronization method is to cut all batches
to the minimum length.¹¹ Another method is to treat the
absent parts of shorter batch trajectories as missing data and
then try to derive a model with data from long batches.¹² A
more reasonable method is to find a surrogate variable that has
the same starting and ending values for all batches to replace the
Received: April 11, 2014
Revised: September 1, 2014
Accepted: September 4, 2014
Published: September 4, 2014
Article
pubs.acs.org/IECR
© 2014 American Chemical Society 15101 dx.doi.org/10.1021/ie5015102 | Ind. Eng. Chem. Res. 2014, 53, 15101−15110