different sub-structures. In summary, these local sub-structures cover
the whole process and reflect its natural dynamics.
Note that the presence of the term $|p_{i,k}\,p_{i,j}\,s_{jk}^{z_\tau}|$ together with the constraint $\|p_i\|_2^2 = 1$ renders the ASPCA optimization problem non-convex. An algorithm for solving ASPCA is developed in Algorithm 1.
The online procedure is to adaptively update the ASPCA optimization problem with the current measurement $z_\tau$. Remodeling the process whenever any new measurement becomes available, however, results in large computation and storage requirements. For simplicity, we partition $Z$ into $B$ segments, $Z = [Z_1, Z_2, \ldots, Z_B]^T$, where each segment represents an operating condition. The process is remodeled with each segment $Z_b$ ($b = 1, 2, \ldots, B$) to comply with changes in operating conditions or process characteristics. Arrange the variance of each PC, $\lambda_i^\tau = p_i^{\tau T}\Sigma_X p_i^\tau$ ($i = 1, 2, \ldots, m$), in descending order into $\lambda^\tau = [\lambda_1^\tau, \lambda_2^\tau, \cdots, \lambda_m^\tau]$ ($\lambda_1^\tau \ge \lambda_2^\tau \ge \cdots \ge \lambda_m^\tau$) with corresponding loadings $P^\tau = [p_1^\tau, p_2^\tau, \cdots, p_m^\tau]$.
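The descending-order arrangement of variances and loadings can be sketched in numpy. In the PCA limit ($\beta = 0$) these are simply the eigenpairs of $\Sigma_X$; sparse ASPCA loadings would instead come from Algorithm 1. The function name is illustrative, not from the paper:

```python
import numpy as np

def sorted_eigenpairs(Sigma_X):
    """Return PC variances and loadings sorted by descending variance."""
    eigval, eigvec = np.linalg.eigh(Sigma_X)   # eigh returns ascending order
    order = np.argsort(eigval)[::-1]           # re-index to descending variance
    return eigval[order], eigvec[:, order]
```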
Then, the matrix $X$ is decomposed into the score matrix $T^\tau = X\hat{P}^\tau$ and the residual matrix $E^\tau = X(I - \hat{P}^\tau\hat{P}^{\tau T})$, which is given by $X = T^\tau\hat{P}^{\tau T} + E^\tau$, where $\hat{P}^\tau = [p_1^\tau, p_2^\tau, \ldots, p_l^\tau]$ is the loading matrix and $\Sigma_{T^\tau} = \hat{P}^{\tau T}\Sigma_X\hat{P}^\tau$ is the covariance matrix of the scores. The number of PCs ($l$) is selected as the one minimizing the following BIC-type criterion:

$$\mathrm{BIC}_l = -\log\left(\sum_{i=1}^{l}\lambda_i^\tau\right) + l\,\frac{\log n}{n}, \quad l = 1, 2, \ldots, m-1 \qquad (3)$$

where the first term decreases and the second term increases with $l$; theoretically, BIC therefore attains a minimum. The BIC criterion is based on the idea that the score matrix contains mostly variance information while the residual matrix contains noise.
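The selection of $l$ by Eq. (3) can be sketched as follows; this is a minimal illustration, and the function name is not from the paper:

```python
import numpy as np

def select_n_pcs(lam, n):
    """Pick the number of PCs l minimizing the BIC-type criterion of Eq. (3).

    lam : PC variances (eigenvalues) sorted in descending order.
    n   : number of observations.
    """
    m = len(lam)
    ls = np.arange(1, m)                                   # l = 1, ..., m-1
    bic = -np.log(np.cumsum(lam)[:-1]) + ls * np.log(n) / n
    return int(ls[np.argmin(bic)])
```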
Algorithm 1. (Iterative interior point algorithm for solving ASPCA)

(i) For each sparse loading vector $p_i^\tau$, without loss of generality, perform the following steps individually in the order $i = 1, 2, \ldots, m$.
(ii) Start the algorithm by setting the $i$th PCA loading vector $p_i^*$ as the initial solution of the $i$th sparse loading vector, $p_i^{(0)} = p_i^*$.
(iii) For any $a \ge 1$, based on the last solution $p_i^{\tau(a-1)}$, the original ASPCA optimization problem is revised as the following optimization problem, where $y_i = [y_{i,1}, y_{i,2}, \cdots, y_{i,m}]$ denotes the absolute-value terms of $p_i$:

$$p_i^{\tau,*} = \min_{p_i,\,y_i}\; -p_i^T\Sigma_X p_i + \beta\sum_{j=1}^{m}\sum_{k=1}^{m} y_{i,k}\,y_{i,j}\,s_{jk}^{z_\tau}$$
$$\text{s.t.}\;\begin{cases} p_i^T p_i^{\tau(a-1)} = 1 \\ p_i^T p_j^\tau = 0 \quad (j = 1, 2, \ldots, i-1) \\ -y_{i,k} \le p_{i,k} \le y_{i,k} \quad (k = 1, 2, \ldots, m) \end{cases}$$

(iv) Solve the above optimization problem using the interior point method (available in Matlab). Then, obtain $p_i^{\tau(a)}$ by normalizing $p_i^{\tau,*}$: $p_i^{\tau(a)} = p_i^{\tau,*}/\|p_i^{\tau,*}\|$.
(v) Set $a = a + 1$, and repeat Steps (iii)–(iv) until convergence, $\|p_i^{\tau(a)} - p_i^{\tau(a-1)}\|_1 \le \varepsilon$, where $\varepsilon$ denotes the convergence threshold (take $\varepsilon = 0.05$ in this work).
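A rough sketch of Algorithm 1, using scipy's SLSQP solver as a stand-in for the Matlab interior-point method of step (iv). All names, defaults, and the solver choice are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

def aspca_loadings(Sigma_X, S, beta=0.5, n_comp=2, eps=0.05, max_iter=50):
    """Sequentially extract sparse loading vectors (sketch of Algorithm 1).

    Sigma_X : (m, m) data covariance matrix.
    S       : (m, m) matrix of the weights s_{jk}^{z_tau}.
    """
    m = Sigma_X.shape[0]
    # Step (ii): initialize each p_i with the ordinary PCA loading.
    eigval, eigvec = np.linalg.eigh(Sigma_X)
    P0 = eigvec[:, np.argsort(eigval)[::-1]]
    P = []
    for i in range(n_comp):
        p_prev = P0[:, i].copy()
        for a in range(max_iter):
            # Step (iii): revised problem in x = [p; y].
            def obj(x):
                p, y = x[:m], x[m:]
                return -p @ Sigma_X @ p + beta * (y @ S @ y)
            cons = [{"type": "eq", "fun": lambda x, q=p_prev: x[:m] @ q - 1.0}]
            for pj in P:  # orthogonality to previously extracted loadings
                cons.append({"type": "eq", "fun": lambda x, q=pj: x[:m] @ q})
            # -y_k <= p_k <= y_k  <=>  y_k - |p_k| >= 0
            cons.append({"type": "ineq", "fun": lambda x: x[m:] - np.abs(x[:m])})
            x0 = np.concatenate([p_prev, np.abs(p_prev)])
            res = minimize(obj, x0, constraints=cons, method="SLSQP")
            # Step (iv): normalize the solution.
            p_new = res.x[:m] / np.linalg.norm(res.x[:m])
            converged = np.abs(p_new - p_prev).sum() <= eps  # step (v)
            p_prev = p_new
            if converged:
                break
        P.append(p_prev)
    return np.column_stack(P)
```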
2.2. Process monitoring based on ASPCA
Based on the ASPCA model built in the above section, the first $l$ PCs span the PC subspace and the last $m - l$ PCs construct the residual subspace. Thus, each measurement is identified by its score Mahalanobis distance in the PC subspace and the model error in the residual subspace. Then, two monitoring statistics, Quasi-$T^2$ ($QT^2$) in the PC subspace and SPE in the residual subspace, are respectively defined and compared to their corresponding confidence limits as follows.
$$QT^2 = t^{\tau T}\Sigma_{T^\tau}^{-1}t^\tau = z_\tau^T\hat{P}^\tau\Sigma_{T^\tau}^{-1}\hat{P}^{\tau T}z_\tau \le QT^2_{\lim} \qquad (4)$$

$$SPE = e^{\tau T}e^\tau = z_\tau^T\left(I - \hat{P}^\tau\hat{P}^{\tau T}\right)z_\tau \le SPE_{\lim} \qquad (5)$$
where $t^\tau = \hat{P}^{\tau T}z_\tau$ contains the scores and $e^\tau = (I - \hat{P}^\tau\hat{P}^{\tau T})z_\tau$ contains the residuals of $z_\tau$. Note that if $\beta = 0$, ASPCA reduces to PCA. In PCA monitoring, the PCs are mutually independent and $\Sigma_{T^\tau}$ is diagonal; in this case, the monitoring statistic $QT^2$ equates to $T^2$. However, in ASPCA monitoring, the extracted PCs are often not independent and barely conform to a specific distribution; hence, the confidence limit cannot be determined directly from a particular approximate distribution. An alternative approach to determining the confidence limit of $QT^2$ ($QT^2_{\lim}$) is to use kernel density estimation (KDE) [22,23]. A univariate kernel estimator is used in this work, defined as
$$\hat{f}(x; d) = \frac{1}{nd}\sum_{i=1}^{n} K\!\left(\frac{x - x_i}{d}\right) \qquad (6)$$
where x is the data point under consideration; x
i
is an observation value
from the data set; d is the window width or the smoothing parameter;
n is the number of observations; and K is the kernel function, which
determines the shape of the smooth curve and satisfies the following
conditions:
$$K(x) \ge 0, \qquad \int_{-\infty}^{+\infty} K(x)\,dx = 1 \qquad (7)$$
In this work, a Gaussian function is chosen for $K$. Thus, the confidence limit can be determined as the value covering 95% or 99% of the area under the density function. It should be pointed out that the confidence limit of SPE ($SPE_{\lim}$) can also be determined by KDE. Consequently, a fault is reported if either of the monitoring statistics $QT^2$ and SPE violates its corresponding confidence limit.
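The monitoring procedure of Eqs. (4)–(7) can be sketched as follows, assuming a fitted loading matrix. The KDE confidence limits are taken as the $\alpha$-quantiles of the Gaussian-kernel density estimates; all names and the quantile construction are illustrative assumptions:

```python
import numpy as np
from scipy.stats import gaussian_kde

def monitor(Z_train, z_new, P_hat, alpha=0.99):
    """Flag a fault if QT^2 or SPE of z_new exceeds its KDE-based limit.

    Z_train : (n, m) normal operating data used to set the limits.
    P_hat   : (m, l) loading matrix (l >= 2 assumed here).
    """
    T = Z_train @ P_hat                        # training scores
    Sigma_T_inv = np.linalg.inv(np.cov(T, rowvar=False))
    resid = Z_train - T @ P_hat.T              # training residuals

    # Per-sample statistics on training data, Eqs. (4)-(5).
    qt2 = np.einsum("ij,jk,ik->i", T, Sigma_T_inv, T)
    spe = (resid ** 2).sum(axis=1)

    def kde_limit(stat):
        # Gaussian KDE as in Eq. (6); limit = alpha-quantile of its CDF.
        kde = gaussian_kde(stat)
        grid = np.linspace(stat.min(), stat.max() * 3, 2000)
        cdf = np.cumsum(kde(grid))
        cdf /= cdf[-1]
        return grid[np.searchsorted(cdf, alpha)]

    qt2_lim, spe_lim = kde_limit(qt2), kde_limit(spe)

    t_new = P_hat.T @ z_new
    e_new = z_new - P_hat @ t_new
    qt2_new = t_new @ Sigma_T_inv @ t_new
    spe_new = e_new @ e_new
    fault = bool(qt2_new > qt2_lim or spe_new > spe_lim)
    return fault, qt2_new, spe_new
```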
2.3. Fault isolation based on ASPCA and dominant PCs
After a fault is detected, it is crucial to find the root cause of the out-of-control status. Under the assumption that variables associated with the fault tend to exhibit large contributions, reconstruction-based contribution (RBC) [24] is popularly adopted to isolate faulty variables. Although the contribution plot approach requires no prior fault information, it can lead to obscure diagnosis because faulty variables may inflate the contributions of unaffected variables. To reduce this "smearing" effect, a fault isolation scheme with dominant PCs is presented in this section.
2.3.1. Dominant principal components
PCs are not equally sensitive to a fault; for a specific fault, usually only a few PCs dominate. Mapping the fault information onto the PCs indiscriminately results in loss of the relevant information and poor monitoring performance. Therefore, it is of much importance for fault isolation to select the fault-dominant PCs and concentrate the fault information on them. The algorithm for selecting dominant PCs proceeds as follows.
Algorithm 2. (Iterative algorithm for selecting dominant PCs)

(i) Start the algorithm by initializing the dominant PC set $DT = \emptyset$.
(ii) Select the $j$th PC, the one with the largest contribution, as another new dominant PC. The contribution of each PC $i \notin DT$ is defined by

$$CT_i = \varphi_{QT^2}\!\left(t^\tau_{DT}\right) - \varphi_{QT^2}\!\left(t^\tau_{\Theta_i}\right) \qquad (8)$$

where $\Theta_i = DT \cup \{i\}$, the reconstructed index is $\varphi_{QT^2}(t^\tau_a) = t^{\tau T}_a\Sigma^{-1}_{T^\tau,a}t^\tau_a$, and $t^\tau_a$, $T^{\tau,a}$ are obtained by removing all corresponding PCs in the set $a$ from $t^\tau$ and $T^\tau$. The optimal $j$ is selected as the one maximizing the above contribution, i.e., $j = \arg\max_i CT_i$.
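The greedy selection of Eq. (8) can be sketched as follows; names are illustrative, and $\varphi$ here removes the PCs in the given set before evaluating the reconstructed $QT^2$, as in the text:

```python
import numpy as np

def select_dominant_pcs(t_tau, Sigma_T, n_dom=2):
    """Greedily pick the PCs whose removal most reduces QT^2 (Algorithm 2).

    t_tau   : (l,) score vector of the faulty sample.
    Sigma_T : (l, l) covariance matrix of the scores.
    """
    l = len(t_tau)

    def phi(removed):
        # Reconstructed index: QT^2 computed on the PCs NOT in `removed`.
        keep = [i for i in range(l) if i not in removed]
        t = t_tau[keep]
        S = Sigma_T[np.ix_(keep, keep)]
        return t @ np.linalg.inv(S) @ t

    DT = set()
    while len(DT) < n_dom:
        # Eq. (8): contribution of each candidate PC i not yet in DT.
        ct = {i: phi(DT) - phi(DT | {i}) for i in range(l) if i not in DT}
        DT.add(max(ct, key=ct.get))            # j = argmax_i CT_i
    return sorted(DT)
```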
428 K. Liu et al. / Chemometrics and Intelligent Laboratory Systems 146 (2015) 426–436