The Unscented Kalman Filter for Nonlinear Estimation
Eric A. Wan and Rudolph van der Merwe
Oregon Graduate Institute of Science & Technology
20000 NW Walker Rd, Beaverton, Oregon 97006
ericwan@ece.ogi.edu, rvdmerwe@ece.ogi.edu
Abstract
The Extended Kalman Filter (EKF) has become a standard
technique used in a number of nonlinear estimation and ma-
chine learning applications. These include estimating the
state of a nonlinear dynamic system, estimating parame-
ters for nonlinear system identification (e.g., learning the
weights of a neural network), and dual estimation (e.g., the
Expectation Maximization (EM) algorithm) where both states
and parameters are estimated simultaneously.
This paper points out the flaws in using the EKF, and
introduces an improvement, the Unscented Kalman Filter
(UKF), proposed by Julier and Uhlman [5]. A central and
vital operation performed in the Kalman Filter is the prop-
agation of a Gaussian random variable (GRV) through the
system dynamics. In the EKF, the state distribution is ap-
proximated by a GRV, which is then propagated analyti-
cally through the first-order linearization of the nonlinear
system. This can introduce large errors in the true posterior
mean and covariance of the transformed GRV, which may
lead to sub-optimal performance and sometimes divergence
of the filter. The UKF addresses this problem by using a
deterministic sampling approach. The state distribution is
again approximated by a GRV, but is now represented using
a minimal set of carefully chosen sample points. These sam-
ple points completely capture the true mean and covariance
of the GRV, and when propagated through the true non-
linear system, capture the posterior mean and covariance
accurately to the 3rd order (Taylor series expansion) for any
nonlinearity. The EKF, in contrast, only achieves first-order
accuracy. Remarkably, the computational complexity of the
UKF is the same order as that of the EKF.
Julier and Uhlman demonstrated the substantial perfor-
mance gains of the UKF in the context of state-estimation
for nonlinear control. Machine learning problems were not
considered. We extend the use of the UKF to a broader class
of nonlinear estimation problems, including nonlinear sys-
tem identification, training of neural networks, and dual es-
timation problems. Our preliminary results were presented
in [13]. In this paper, the algorithms are further developed
and illustrated with a number of additional examples.
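The deterministic sampling described above can be sketched as a standalone unscented transform. The following is a minimal illustration for a generic nonlinearity, not the paper's full UKF recursion; the function name and the scaling parameters alpha, beta, kappa are standard choices assumed here for concreteness.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through a nonlinearity f
    using 2n+1 deterministically chosen sigma points."""
    n = mean.shape[0]
    lam = alpha**2 * (n + kappa) - n
    # Matrix square root of the scaled covariance.
    S = np.linalg.cholesky((n + lam) * cov)
    # Sigma points: the mean, plus/minus each column of S.
    sigmas = np.vstack([mean, mean + S.T, mean - S.T])  # shape (2n+1, n)
    # Weights for the mean and covariance estimates.
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = Wm[0] + (1 - alpha**2 + beta)
    # Propagate each sigma point through the true nonlinearity.
    Y = np.array([f(s) for s in sigmas])
    y_mean = Wm @ Y
    d = Y - y_mean
    y_cov = (Wc[:, None] * d).T @ d
    return y_mean, y_cov
```

For a linear f the transform recovers the exact transformed mean and covariance, which is a useful sanity check before applying it to a nonlinear system.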
This work was sponsored by the NSF under grant IRI-9712346.
1. Introduction
The EKF has been applied extensively to the field of non-
linear estimation. General application areas may be divided
into state-estimation and machine learning. We further di-
vide machine learning into parameter estimation and dual
estimation. The framework for these areas is briefly re-
viewed next.
State-estimation
The basic framework for the EKF involves estimation of the
state of a discrete-time nonlinear dynamic system,
x_{k+1} = F(x_k, v_k)    (1)
y_k = H(x_k, n_k)    (2)

where x_k represents the unobserved state of the system and
y_k is the only observed signal. The process noise v_k drives
the dynamic system, and the observation noise is given by
n_k. Note that we are not assuming additivity of the noise
sources. The system dynamic models F and H are assumed
known. In state-estimation, the EKF is the standard method
of choice to achieve a recursive (approximate) maximum-
likelihood estimation of the state x_k. We will review the
EKF itself in this context in Section 2 to help motivate the
Unscented Kalman Filter (UKF).
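As a concrete illustration of the state-space form in equations (1) and (2), the system can be simulated directly. The particular F and H below are hypothetical choices, not taken from the paper, picked only to show noise sources that enter non-additively.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical scalar system in the form x_{k+1} = F(x_k, v_k),
# y_k = H(x_k, n_k). The noises enter through F and H rather than
# being added on, matching the non-additive-noise assumption.
def F(x, v):
    return 0.5 * x + 2.0 * np.cos(x) * (1.0 + v)   # process noise scales a term

def H(x, n):
    return x**2 * (1.0 + 0.1 * n)                  # multiplicative observation noise

x = 0.1
states, observations = [], []
for k in range(50):
    v = rng.normal(scale=0.1)   # process noise v_k
    n = rng.normal(scale=0.1)   # observation noise n_k
    x = F(x, v)
    y = H(x, n)
    states.append(x)
    observations.append(y)
```

A filter for this system sees only the sequence in `observations` and must recursively estimate the hidden sequence in `states`.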
Parameter Estimation
The classic machine learning problem involves determining
a nonlinear mapping
y_k = G(x_k, w)    (3)

where x_k is the input, y_k is the output, and the nonlinear
map G is parameterized by the vector w. The nonlinear
map, for example, may be a feedforward or recurrent neural
network (w are the weights), with numerous applications
in regression, classification, and dynamic modeling. Learn-
ing corresponds to estimating the parameters w. Typically,
a training set is provided with sample pairs consisting of
known input and desired outputs, {x_k, d_k}. The error of
the machine is defined as e_k = d_k - G(x_k, w), and the
goal of learning involves solving for the parameters w in
order to minimize the expected squared error.
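This learning setup can be sketched in a few lines. The example below assumes a simple linear-in-parameters map standing in for the neural network G, with a hypothetical data-generating model and step size; it minimizes the mean squared error by plain gradient descent.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameterized map y = G(x, w): a linear model here,
# standing in for the neural network of the text.
def G(x, w):
    return w[0] + w[1] * x

# Training pairs {x_k, d_k} drawn from a noisy target.
xs = rng.uniform(-1, 1, size=200)
ds = 2.0 * xs - 0.5 + rng.normal(scale=0.05, size=200)

w = np.zeros(2)
lr = 0.1
for _ in range(500):
    e = ds - G(xs, w)                 # error e_k = d_k - G(x_k, w)
    # Gradient of the mean squared error with respect to w.
    grad = np.array([-2 * e.mean(), -2 * (e * xs).mean()])
    w -= lr * grad

# w should approach the generating parameters (-0.5, 2.0).
```

The EKF view of this same problem treats w as the state of a dynamic system and the training pairs as observations, which is the parameter-estimation setting developed in the sections that follow.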