finite-dimensional systems under the white-noise assumption.
• Kalman filter is an unbiased minimum-variance estimator under LQG (linear quadratic Gaussian) circumstances. When the Gaussian assumption on the noise is violated, the Kalman filter remains the optimal linear estimator in the mean-square sense, but its estimate no longer coincides with the conditional mean (i.e. it is biased), nor does it attain the minimum variance. The Kalman filter is not robust because of its underlying assumption on the noise density model.
• Kalman filter provides an exact solution to the linear Gaussian prediction and filtering problem. Concerning the smoothing problem, the off-line estimation version of the Kalman filter is given by the Rauch-Tung-Striebel (RTS) smoother [384], which consists of a forward filter in the form of a Kalman filter and a backward recursive smoother (a sketch of the backward pass is given after this list). The RTS smoother is more computationally efficient than the optimal smoother [206].
• The conventional Kalman filter is a point-valued filter; it can also be extended to set-valued filtering [39], [339], [80].
• In the literature, there exist many variants of the Kalman filter, e.g. the covariance filter, the information filter, and square-root Kalman filters. See [205], [247] for more details and [403] for a unifying review.
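As mentioned in the RTS item above, the backward smoothing pass admits a compact recursion. The following is a minimal numpy sketch under illustrative assumptions (a time-invariant transition matrix F and precomputed forward-filter outputs); the function name and argument layout are not from any cited reference:

```python
import numpy as np

def rts_backward_pass(x_filt, P_filt, x_pred, P_pred, F):
    """Backward pass of a Rauch-Tung-Striebel smoother.

    x_filt[n], P_filt[n] : filtered mean/covariance at time n
    x_pred[n], P_pred[n] : one-step-ahead predicted mean/covariance
    F                    : state-transition matrix (assumed time-invariant)
    """
    N = len(x_filt)
    x_s, P_s = [None] * N, [None] * N
    # The smoother starts from the filter's final estimate
    x_s[-1], P_s[-1] = x_filt[-1], P_filt[-1]
    for n in range(N - 2, -1, -1):
        # Smoother gain: weights the correction from the future against the prediction
        C = P_filt[n] @ F.T @ np.linalg.inv(P_pred[n + 1])
        x_s[n] = x_filt[n] + C @ (x_s[n + 1] - x_pred[n + 1])
        P_s[n] = P_filt[n] + C @ (P_s[n + 1] - P_pred[n + 1]) @ C.T
    return x_s, P_s
```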
C. Optimum Nonlinear Filtering
In practice, the use of the Kalman filter is limited by the ubiquitous nonlinearity and non-Gaussianity of the physical world. Hence, since the publication of the Kalman filter, numerous efforts have been devoted to the generic filtering problem, mostly within the Kalman filtering framework. A number
of pioneers, including Zadeh [503], Bucy [61], [60], Won-
ham [496], Zakai [505], Kushner [282]-[285], Stratonovich
[430], [431], investigated the nonlinear filtering problem.
See also the papers seeking optimal nonlinear filters [420],
[289], [209]. In general, the nonlinear filtering problem per se consists in finding the conditional probability distribution (or density) of the state given the observations up to the current time [420]. In particular, the solution of the nonlinear
filtering problem using the theory of conditional Markov processes [430], [431] is very attractive from a Bayesian perspective and has a number of advantages over other methods. The recursive transformations of the posterior measures are characteristic of this theory. Strictly speaking, the number of variables needed to represent the density function is infinite, but not all of them are of equal importance; thus it is advisable to select the important ones and reject the remainder.
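To make these recursive transformations explicit, the conditional density can be propagated by the standard two-step Bayesian recursion (prediction followed by Bayes' update); the notation here is generic and not tied to a numbered equation of this manuscript:

$$p(x_n \mid y_{0:n-1}) = \int p(x_n \mid x_{n-1})\, p(x_{n-1} \mid y_{0:n-1})\, dx_{n-1},$$

$$p(x_n \mid y_{0:n}) = \frac{p(y_n \mid x_n)\, p(x_n \mid y_{0:n-1})}{\int p(y_n \mid x_n)\, p(x_n \mid y_{0:n-1})\, dx_n}.$$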
Solutions to the nonlinear filtering problem fall into two categories: global methods and local methods. In the global approach, one attempts to solve a PDE (e.g. the Zakai equation or the Kushner-Stratonovich equation) instead of the ODE that arises in the linear case; these equations are mostly analytically intractable, hence numerical approximation techniques are needed to solve them. In special scenarios (e.g. the exponential family) with some assumptions, the nonlinear filtering problem can admit tractable solutions. In the local approach, finite-sum approximations (e.g. the Gaussian sum filter) or linearization techniques (e.g. the EKF) are usually used. In the EKF, by defining
$$\hat{F}_{n+1,n} = \left.\frac{df(x)}{dx}\right|_{x=\hat{x}_n}, \qquad \hat{G}_n = \left.\frac{dg(x)}{dx}\right|_{x=\hat{x}_{n|n-1}},$$
the equations (2a)–(2b) can be linearized into (3a)–(3b), and the conventional Kalman filtering technique can then be employed. The details of the EKF can be found in many books, e.g. [238], [12], [96], [80], [195], [205], [206]. Because the EKF always approximates the posterior $p(x_n|y_{0:n})$ as a Gaussian, it works well for some types of nonlinear problems, but it may perform poorly when the true posterior is non-Gaussian (e.g. heavily skewed or multimodal). Gelb [174] provided an early overview of the uses of the EKF. It is noted that the estimate given by the EKF is usually biased, since in general $E[f(x)] \neq f(E[x])$.
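For concreteness, the following is a minimal numpy sketch of one EKF predict/update cycle built around the Jacobians $\hat{F}_{n+1,n}$ and $\hat{G}_n$ defined above. The additive-noise model, the callable arguments, and the covariances Q and R are illustrative assumptions, not the notation of (2a)–(3b):

```python
import numpy as np

def ekf_step(x_est, P, y, f, g, F_jac, G_jac, Q, R):
    """One predict/update cycle of an extended Kalman filter.

    f, g         : nonlinear state-transition and measurement functions
    F_jac, G_jac : their Jacobians, evaluated as in the text
    Q, R         : process- and measurement-noise covariances
    """
    # Predict: propagate the estimate through f, linearizing around x_est
    F = F_jac(x_est)                       # F_hat_{n+1,n} = df/dx at x = x_est
    x_pred = f(x_est)
    P_pred = F @ P @ F.T + Q
    # Update: linearize g around the predicted state x_{n|n-1}
    G = G_jac(x_pred)                      # G_hat_n = dg/dx at x = x_pred
    S = G @ P_pred @ G.T + R               # innovation covariance
    K = P_pred @ G.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (y - g(x_pred))   # correct with the measurement innovation
    P_new = (np.eye(len(x_new)) - K @ G) @ P_pred
    return x_new, P_new
```

The Gaussian approximation discussed above is visible in the last two lines: whatever the shape of the true posterior, only a mean and a covariance are carried forward.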
In summary, a number of methods have been developed
for nonlinear filtering problems:
• Linearization methods: first-order Taylor series expan-
sion (i.e. EKF), and higher-order filter [20], [437].
• Approximation by finite-dimensional nonlinear filters: the Beneš filter [33], [34], the Daum filter [111]-[113], and the projection filter [202], [55].
• Classic PDE methods, e.g. [282], [284], [285], [505],
[496], [497], [235].
• Spectral methods [312].
• Neural filter methods, e.g. [209].
• Numerical approximation methods, as to be discussed
in Section V.
C.1 Finite-dimensional Filters
The on-line solution of the FPK equation can be avoided if the unnormalized filtered density admits a finite-dimensional sufficient statistic. Beneš [33], [34] first explored the exact finite-dimensional filter^{32} in the nonlinear filtering scenario. Daum [111] extended the framework to a more general case that includes the Kalman filter and the Beneš filter as special cases [113]. Some new developments of the Daum filter with virtual measurements were summarized in [113]. The recently proposed projection filters [202], [53]-[57] also belong to the finite-dimensional filter family.
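To give the flavor of such existence conditions (recalled here as the standard statement of the scalar Beneš setting, not quoted from [33], [34]): for the scalar model $dx_t = f(x_t)\,dt + dw_t$ with observations $dy_t = x_t\,dt + dv_t$, an exact finite-dimensional filter exists when the drift satisfies the Riccati-type condition

$$f'(x) + f^2(x) = ax^2 + bx + c$$

for some constants $a$, $b$, $c$; this covers, e.g., $f(x) = \tanh(x)$ (for which $f' + f^2 = 1$) beyond the linear case.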
In [111], starting with SDE filtering theory, Daum intro-
duced a gradient function
$$r(t,x) = \frac{\partial}{\partial x} \ln \psi(t,x),$$
where $\psi(t,x)$ is the solution of the FPK equation of (11a), with the form
$$\frac{\partial \psi(t,x)}{\partial t} = -\frac{\partial \psi(t,x)}{\partial x}\,f - \psi\,\mathrm{tr}\!\left[\frac{\partial f}{\partial x}\right] + \frac{1}{2}\,\mathrm{tr}\!\left[A\,\frac{\partial^2 \psi}{\partial x\,\partial x^T}\right],$$
with an appropriate initial condition (see [111]), and $A = \sigma(t,x_t)\,\sigma(t,x_t)^T$. When the measurement equation (11b) is
^{32} Roughly speaking, a finite-dimensional filter is one that can be implemented by integrating a finite number of ODEs, or one whose sufficient statistic involves a finite number of variables.