Algorithm 1 Particle filtering with missing outputs
1: Generate $L$ samples $\{x_0^l\}_{l=1,2,\ldots,L}$ from the initial PDF $p(x_0)$ and set the weight of each sample to $w_0^l = 1/L$. Set $k = 1$.
2: repeat
3: Generate new particles $\{x_k^l\}_{l=1,2,\ldots,L}$ according to Eq. (5).
4: When $y_k$ is available, calculate the corresponding weights for all the new particles with Eq. (6); when $y_k$ is missing, calculate the corresponding weights with Eq. (7).
5: Calculate $N_{eff}$ according to Eq. (8). If $N_{eff} < N_{thd}$, execute the resampling step and replace all the particles generated in step 3 with the resampled particles; the weights of all the new particles are reset to $1/L$. If $N_{eff} \geq N_{thd}$, go to the next step.
6: Let $k = k + 1$ and go to repeat.
7: until $k = N$.
the real PDF $p(x_{1:k} \mid y_{o_1:o_\gamma}, \Theta_s)$, the principle of importance sampling is used here [15]. Suppose $q(x_{1:k} \mid y_{o_1:o_\gamma}, \Theta_s)$ is a proposal from which $\{x_{1:k}^l,\ l = 1, 2, \ldots, L\}$ are easily generated, and $q(\cdot)$ is often called an importance density [15]. The importance density is usually selected as [15], [24]
$$q(x_k \mid y_{o_1:o_\gamma}, \Theta_s) = p(x_k \mid x_{k-1}, \Theta_s), \quad (5)$$
where $p(x_k \mid x_{k-1}, \Theta_s)$ represents the state transition probability from instant $k-1$ to instant $k$. Based on this choice, the corresponding weights for the new particles at time $k$ are derived as [24]
$$w_k^l \propto w_{k-1}^l \, p(y_k \mid x_k^l, \Theta_s). \quad (6)$$
When the output data are missing, the new particles are still generated from $p(x_k \mid x_{k-1}^l, \Theta_s)$ and the corresponding weights remain unchanged [15], [24]
$$w_k^l = w_{k-1}^l. \quad (7)$$
Moreover, a resampling step is applied here to deal with the degeneracy of the particles [15]. The weight of every particle after resampling is then reset to $w_k^l = 1/L$ [24]. To judge whether the resampling step should be performed, the effective particle number $N_{eff}$ is calculated as [15]
$$N_{eff} = \frac{1}{\sum_{l=1}^{L} (w_k^l)^2}, \quad (8)$$
where $w_k^l$ is iteratively calculated based on Eq. (6) or Eq. (7). If $N_{eff}$ falls below the threshold $N_{thd}$, a degeneracy problem occurs and the resampling step is required [15], [27]. The main steps of the particle filter are summarized in Algorithm 1.
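As a rough illustration of Algorithm 1 and Eqs. (5)-(8), the sketch below runs the filter on a generic state-space model. The helper routines `transition_sample`, `likelihood`, and `x0_sampler`, the input-indexing convention, and the default threshold $N_{thd} = L/2$ are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def particle_filter_missing(y, u, transition_sample, likelihood, x0_sampler,
                            L=500, N_thd=None, rng=None):
    """Sketch of Algorithm 1: particle filtering when some outputs y[k] are missing.

    Hypothetical helpers (not defined in the paper):
      transition_sample(x_prev, u_prev, rng) -- draws from p(x_k | x_{k-1}, Theta_s), Eq. (5)
      likelihood(y_k, u_k, x_k)              -- evaluates p(y_k | x_k, Theta_s), used in Eq. (6)
      x0_sampler(L, rng)                     -- draws L samples from the initial PDF p(x_0)
    A missing output is encoded as None in the sequence y.
    """
    rng = np.random.default_rng() if rng is None else rng
    N_thd = L / 2 if N_thd is None else N_thd        # assumed threshold, a common heuristic

    x = np.asarray(x0_sampler(L, rng), dtype=float)  # step 1: initial particles
    w = np.full(L, 1.0 / L)                          # initial weights set to 1/L
    particles, weights = [], []

    for k in range(len(y)):                          # steps 2-7
        u_prev = u[k - 1] if k > 0 else None         # input indexing is an assumption
        x = np.array([transition_sample(xl, u_prev, rng) for xl in x])   # Eq. (5)

        if y[k] is not None:                         # output available: reweight, Eq. (6)
            w = w * np.array([likelihood(y[k], u[k], xl) for xl in x])
            w = w / np.sum(w)
        # output missing: weights remain unchanged, Eq. (7)

        N_eff = 1.0 / np.sum(w ** 2)                 # effective particle number, Eq. (8)
        if N_eff < N_thd:                            # degeneracy: resample and reset weights
            idx = rng.choice(L, size=L, p=w)
            x, w = x[idx], np.full(L, 1.0 / L)

        particles.append(x.copy())
        weights.append(w.copy())
    return particles, weights
```

The returned particles and weights approximate the filtered PDFs and can later be reused when the expectations required by the EM algorithm are evaluated.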
IV. THE DETAILED DERIVATIONS OF THE PROPOSED
METHOD
A. Brief introduction to EM algorithm
The EM algorithm is an effective method in the area of system identification and shows superiority over the conventional maximum likelihood estimate (MLE) in incomplete-data cases [14], [15]. The core of the EM algorithm is to iteratively execute the expectation step (E-step) and the maximization step (M-step) until the desired parameters are obtained [19], [21]. In the E-step, the expected value of the complete-data log-likelihood (the Q-function) is calculated based on the currently estimated parameter $\Theta_s$ as [19]
$$Q(\Theta \mid \Theta_s) = \mathrm{E}_{C_{mis} \mid C_{obs}, \Theta_s}\{\log p(C_{mis}, C_{obs} \mid \Theta)\}. \quad (9)$$
In the M-step, the estimated parameter is updated by maximizing the Q-function as [10], [22]
$$\Theta = \arg\max_{\Theta} Q(\Theta \mid \Theta_s). \quad (10)$$
The EM algorithm is iterative and its convergence is guaranteed because the Q-function is nondecreasing in each iteration [28]. The initial choice $\Theta_1$ is important for the EM algorithm.
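For concreteness, the following sketch shows the E-step/M-step cycle of Eqs. (9) and (10); the helper names `q_function` and `maximize_q`, the iteration cap, and the stopping tolerance are assumptions for illustration, not the paper's implementation.

```python
def em_identify(q_function, maximize_q, theta_init, max_iter=100, tol=1e-6):
    """Generic EM iteration of Eqs. (9)-(10); helper names are hypothetical.

    q_function(theta, theta_s) -- evaluates Q(theta | theta_s), Eq. (9)
    maximize_q(theta_s)        -- returns argmax_theta Q(theta | theta_s), Eq. (10)
    theta_init                 -- the initial choice Theta_1
    """
    theta_s = theta_init
    for _ in range(max_iter):
        theta_new = maximize_q(theta_s)       # M-step, Eq. (10)
        # The Q-function is nondecreasing over iterations [28];
        # stop when the improvement becomes negligible (assumed criterion).
        if abs(q_function(theta_new, theta_s) - q_function(theta_s, theta_s)) < tol:
            return theta_new
        theta_s = theta_new
    return theta_s
```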
B. The derivations of Q-function
Based on Eqs. (1)-(3) and the probability chain rule, the complete-data log-likelihood $\log p(C_{mis}, C_{obs} \mid \Theta)$ can be simplified as
$$\begin{aligned}
\log p(u_{1:N}, x_{1:N}, y_{1:N}, \lambda_{1:N} \mid \Theta) = C_1 &+ \sum_{k=1}^{N} \log p(y_k \mid u_k, x_k, \lambda_k, \sigma, \theta_h) + \sum_{k=1}^{N} \log p(\lambda_k \mid r) \\
&+ \sum_{k=2}^{N} \log p(x_k \mid x_{k-1}, u_{k-1}, \delta, \theta_f) + \log p(x_1 \mid \Theta_1), \quad (11)
\end{aligned}$$
where $C_1 = \log p(u_{1:N} \mid \Theta)$ is a constant. Considering that the system output data are $y_{1:N} = \{y_{o_1:o_\alpha}, y_{m_1:m_\beta}\}$, the conditional expectation of $\log p(C_{mis}, C_{obs} \mid \Theta)$ with respect to $\lambda_{1:N}$, $x_{1:N}$, and $y_{m_1:m_\beta}$ can be calculated as
$$\begin{aligned}
Q(\Theta \mid \Theta_s) = {}& \sum_{k=o_1}^{o_\alpha} \int\!\!\int p(x_k \mid C_{obs}, \Theta_s)\, p(\lambda_k \mid C_{obs}, x_k, \Theta_s) \log p(y_k \mid u_k, x_k, \lambda_k, \sigma, \theta_h)\, d\lambda_k\, dx_k \\
&+ \sum_{k=m_1}^{m_\beta} \int\!\!\int\!\!\int p(x_k, y_k \mid C_{obs}, \Theta_s)\, p(\lambda_k \mid C_{obs}, x_k, y_k, \Theta_s) \log p(y_k \mid u_k, x_k, \lambda_k, \sigma, \theta_h)\, d\lambda_k\, dx_k\, dy_k \\
&+ \sum_{k=o_1}^{o_\alpha} \int\!\!\int p(x_k \mid C_{obs}, \Theta_s)\, p(\lambda_k \mid C_{obs}, x_k, \Theta_s) \log p(\lambda_k \mid r)\, d\lambda_k\, dx_k \\
&+ \sum_{k=m_1}^{m_\beta} \int\!\!\int\!\!\int p(x_k, y_k \mid C_{obs}, \Theta_s)\, p(\lambda_k \mid C_{obs}, x_k, y_k, \Theta_s) \log p(\lambda_k \mid r)\, d\lambda_k\, dx_k\, dy_k \\
&+ \sum_{k=2}^{N} \int\!\!\int \left[ p(x_k, x_{k-1} \mid C_{obs}, \Theta_s) \log p(x_k \mid x_{k-1}, u_{k-1}, \delta, \theta_f) \right] dx_k\, dx_{k-1} \\
&+ \int p(x_1 \mid C_{obs}, \Theta_s) \log p(x_1 \mid \Theta_1)\, dx_1 + C_1. \quad (12)
\end{aligned}$$
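The expectations in Eq. (12) generally have no closed form. A common practice in particle-based EM, shown here only as an assumed illustration and not necessarily the paper's exact scheme, is to approximate the outer expectation over $x_k$ with the particles and weights produced by Algorithm 1. The sketch below does this for the first term of Eq. (12); the helper `expected_log_p_y`, which carries out the inner expectation over $\lambda_k$, is hypothetical.

```python
def q_term_observed(y, u, particles, weights, expected_log_p_y):
    """Particle approximation of the first term of Eq. (12) (sketch under assumptions).

    particles[k], weights[k]        -- filtered particles and weights at instant k (Algorithm 1)
    expected_log_p_y(y_k, u_k, x_l) -- hypothetical helper returning the expectation of
                                       log p(y_k | u_k, x_k, lambda_k, sigma, theta_h)
                                       under p(lambda_k | C_obs, x_k, Theta_s)
    Missing outputs (y[k] is None) are skipped; they belong to the second term of Eq. (12).
    """
    total = 0.0
    for k, y_k in enumerate(y):
        if y_k is None:                       # keep only the observed instants o_1:o_alpha
            continue
        for x_l, w_l in zip(particles[k], weights[k]):
            total += w_l * expected_log_p_y(y_k, u[k], x_l)
    return total
```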