Automatica 50 (2014) 2777–2786
Detecting abnormal situations using the Kullback–Leibler divergence
Jiusun Zeng a,d, Uwe Kruger b,1, Jaap Geluk c, Xun Wang c, Lei Xie d,2
a College of Metrology and Measurement Engineering, China Jiliang University, Hangzhou 310018, PR China
b Department of Mechanical & Industrial Engineering, Sultan Qaboos University, P.O. Box 33, Al Khod, Oman
c Department of Mathematics, The Petroleum Institute, P.O. Box 2533, Abu Dhabi, United Arab Emirates
d State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, PR China
Article info
Article history:
Received 9 November 2013
Received in revised form 11 June 2014
Accepted 17 June 2014
Available online 3 October 2014
Keywords:
Kullback–Leibler divergence
Multivariate probability density function
Incipient fault condition
Fault detection
Increased sensitivity
Abstract
This article develops statistics based on the Kullback–Leibler (KL) divergence to monitor large-scale
technical systems. These statistics detect anomalous system behavior by comparing estimated density
functions for the current process behavior with reference density functions. For Gaussian distributed
process variables, the paper proves that the difference between density functions, as measured by the KL divergence, is a more sensitive indicator of abnormal behavior than the statistics used in existing multivariate monitoring work. To cater for a wide range of potential application areas, the paper develops monitoring concepts for linear static systems that can produce Gaussian as well as non-Gaussian distributed process variables. Using recorded data from a glass melter, the article demonstrates the increased sensitivity of the KL-based statistics by comparing them with competing ones.
© 2014 Elsevier Ltd. All rights reserved.
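The sensitivity claim above rests on the fact that the KL divergence between two multivariate Gaussian densities is available in closed form, KL = ½[tr(Σ₁⁻¹Σ₀) + (μ₁−μ₀)ᵀΣ₁⁻¹(μ₁−μ₀) − d + ln(det Σ₁/det Σ₀)]. The following is a minimal sketch of that formula, not the paper's implementation; the function name and test values are illustrative:

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """Closed-form KL divergence KL( N(mu0, S0) || N(mu1, S1) )."""
    d = len(mu0)
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    term_trace = np.trace(S1_inv @ S0)          # tr(S1^-1 S0)
    term_mahal = diff @ S1_inv @ diff           # Mahalanobis term for the mean shift
    term_logdet = np.log(np.linalg.det(S1) / np.linalg.det(S0))
    return 0.5 * (term_trace + term_mahal - d + term_logdet)

mu, S = np.zeros(2), np.eye(2)
print(kl_gaussian(mu, S, mu, S))            # identical densities -> 0
print(kl_gaussian(mu, S, mu + 0.1, S))      # small mean shift -> strictly positive
```

With identity covariances, the divergence reduces to half the squared norm of the mean shift, so even an incipient shift that barely moves a variance-based statistic produces a nonzero divergence.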
1. Introduction
Detecting abnormal operating conditions is of fundamental im-
portance to ensure the safe, reliable and economic operation of
technical systems. Related research can be broadly divided into
model-based, signal-based, rule-based and knowledge-based techniques, and their applications span a wide range, including general manufacturing, automotive and aircraft systems, as well as civil engineering and chemical systems (Kruger & Xie, 2012). Owing to the availability of large, routinely updated data records, the application of multivariate statistics has also gained significant attention over the past decades (Dunia, Qin, Edgar, & McAvoy, 1996;
Feital et al., 2010; Ge, Xie, Kruger, & Song, 2012; Kano, Hasebe,
Hashimoto, & Ohno, 2004; Kourti, 2005; Lee, Qin, & Lee, 2006; Liu,
Xie, Kruger, Littler, & Wang, 2008; Miletic, Quinn, Dudzic, Vaculik,
& Champagne, 2004; Venkatasubramanian, Rengaswamy, Kavuri, & Yin, 2003), largely owing to their conceptual simplicity (Kruger & Xie, 2012).

✩ This work was supported by the Petroleum Institute, internal grant RIFP-14301, and the Natural Science Foundation of China, grant numbers 61203088, 61320106009, 61374121. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Juergen Hahn under the direction of Editor Frank Allgöwer.
E-mail addresses: jszeng@cjlu.edu.cn (J. Zeng), uwe.kruger@gmail.com (U. Kruger), jgeluk@pi.ac.ae (J. Geluk), xwang@pi.ac.ae (X. Wang), leix@csc.zju.edu.cn (L. Xie).
1 Tel.: +968 2414 2549; fax: +968 2414 1316.
2 Tel.: +86 571 87952233; fax: +86 571 87951200.
Multivariate statistical approaches rely, predominantly, on
non-causal data structures identified using principal component
analysis (PCA) (AlGhazzawi & Lennox, 2008; Dunia et al., 1996;
Feital et al., 2010), for Gaussian distributed source signals, and
independent component analysis (Ge et al., 2012; Kano et al., 2004;
Lee et al., 2006) as a non-Gaussian extension. The independent
components are embedded within the PCA components (Liu et al.,
2008), which follows from the data structure:
y(k) = Cs(k) + e(k). (1)
Here, y ∈ R^{d_y} is a measured data vector, s ∈ R^{d_s}, d_s < d_y, is a vector of source variables that has the density function f_s and the covariance matrix Σ_s, C ∈ R^{d_y × d_s} is a parameter matrix, e ∈ R^{d_y} is an error vector that has a Gaussian distribution, e ∼ N{0, Σ_e}, and k is a sample index. A practically reasonable assumption is that the non-diagonal elements of Σ_e are zero. If Σ_e = σ_e^2 I, an eigendecomposition of the covariance matrix of y, Σ_y = C Σ_s C^T + Σ_e, yields σ_e^2 and the column space of C (Kruger & Xie, 2012). Conversely, if Σ_e ≠ σ_e^2 I, the application of maximum likelihood PCA allows the estimation of the diagonal elements of Σ_e and the column space of C (Feital et al., 2010; Kruger & Xie, 2012; Liu et al., 2008; Narasimhan & Shah, 2008).
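The identifiability argument above can be illustrated numerically. The sketch below uses illustrative dimensions and parameter values, and assumes unit-variance Gaussian sources and Σ_e = σ_e^2 I; it simulates the data structure in Eq. (1) and recovers σ_e^2 and the column space of C from an eigendecomposition of the sample covariance of y:

```python
import numpy as np

rng = np.random.default_rng(0)
d_y, d_s, n = 5, 2, 200_000                # illustrative dimensions and sample size

C = rng.normal(size=(d_y, d_s))            # parameter matrix (illustrative values)
sigma_e = 0.1                              # noise std. dev., so Sigma_e = sigma_e^2 I
s = rng.normal(size=(n, d_s))              # Gaussian sources with Sigma_s = I
e = sigma_e * rng.normal(size=(n, d_y))    # isotropic Gaussian error vector
y = s @ C.T + e                            # data structure y(k) = C s(k) + e(k)

# Eigendecomposition of the sample covariance Sigma_y = C Sigma_s C^T + sigma_e^2 I;
# eigh returns eigenvalues in ascending order.
Sigma_y = np.cov(y, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(Sigma_y)

# The d_y - d_s smallest eigenvalues estimate sigma_e^2 ...
sigma2_est = eigvals[: d_y - d_s].mean()
# ... and the d_s leading eigenvectors span the column space of C.
U = eigvecs[:, -d_s:]
residual = C - U @ (U.T @ C)               # projecting C onto span(U) should reproduce C

print(sigma2_est)                          # close to sigma_e**2 = 0.01
print(np.linalg.norm(residual))            # close to 0
```

The d_y − d_s trailing eigenvalues cluster at the noise floor σ_e^2, which is precisely what breaks down when Σ_e is not isotropic and maximum likelihood PCA is needed instead.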
Using a moving window approach, Kruger, Kumar, and Littler
(2007) showed that evaluating changes in the underlying geometry
http://dx.doi.org/10.1016/j.automatica.2014.09.005