finite-dimensional systems under the white-noise assumption.
• Kalman filter is an unbiased minimum-variance estimator under LQG (linear quadratic Gaussian) circumstances. When the Gaussian assumption on the noise is violated, the Kalman filter remains the optimal linear estimator in the mean-square sense, but its estimate no longer coincides with the conditional mean (i.e. it is biased), nor does it attain the minimum variance. The Kalman filter is not robust because of its underlying assumption on the noise density model.
• Kalman filter provides an exact solution to the linear Gaussian prediction and filtering problem. Concerning the smoothing problem, the off-line estimation version of the Kalman filter is given by the Rauch-Tung-Striebel (RTS) smoother [384], which consists of a forward filter in the form of a Kalman filter and a backward recursive smoother (a sketch of the backward pass is given after this list). The RTS smoother is more computationally efficient than the optimal smoother [206].
• The conventional Kalman filter is a point-valued filter; it can also be extended to set-valued filtering [39], [339], [80].
• In the literature, there exist many variants of the Kalman filter, e.g. the covariance filter, the information filter, and square-root Kalman filters. See [205], [247] for more details and [403] for a unifying review.
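As mentioned in the RTS item above, the backward smoothing pass admits a compact recursion. The following is a minimal numpy sketch under illustrative assumptions (a time-invariant transition matrix F and precomputed forward-filter outputs); the function name and argument layout are not from any cited reference:

```python
import numpy as np

def rts_backward_pass(x_filt, P_filt, x_pred, P_pred, F):
    """Backward pass of a Rauch-Tung-Striebel smoother.

    x_filt[n], P_filt[n] : filtered mean/covariance at time n
    x_pred[n], P_pred[n] : one-step-ahead predicted mean/covariance
    F                    : state-transition matrix (assumed time-invariant)
    """
    N = len(x_filt)
    x_s, P_s = [None] * N, [None] * N
    # The smoother starts from the filter's final estimate
    x_s[-1], P_s[-1] = x_filt[-1], P_filt[-1]
    for n in range(N - 2, -1, -1):
        # Smoother gain: weights the correction from the future against the prediction
        C = P_filt[n] @ F.T @ np.linalg.inv(P_pred[n + 1])
        x_s[n] = x_filt[n] + C @ (x_s[n + 1] - x_pred[n + 1])
        P_s[n] = P_filt[n] + C @ (P_s[n + 1] - P_pred[n + 1]) @ C.T
    return x_s, P_s
```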
C. Optimum Nonlinear Filtering
In practice, the use of the Kalman filter is limited by the ubiquitous nonlinearity and non-Gaussianity of the physical world. Hence, since the publication of the Kalman filter, numerous efforts have been devoted to the generic filtering problem, mostly within the Kalman filtering framework. A number
of pioneers, including Zadeh [503], Bucy [61], [60], Won-
ham [496], Zakai [505], Kushner [282]-[285], Stratonovich
[430], [431], investigated the nonlinear filtering problem.
See also the papers seeking optimal nonlinear filters [420],
[289], [209]. In general, the nonlinear filtering problem per se consists in finding the conditional probability distribution (or density) of the state given the observations up to the current time [420]. In particular, the solution of the nonlinear
filtering problem using the theory of conditional Markov processes [430], [431] is very attractive from a Bayesian perspective and has a number of advantages over other methods. The recursive transformations of the posterior measures are characteristic of this theory. Strictly speaking, the number of variables needed to represent the density function is infinite, but not all of them are of equal importance; thus it is advisable to select the important ones and reject the remainder.
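To make these recursive transformations explicit, the conditional density can be propagated by the standard two-step Bayesian recursion (prediction followed by Bayes' update); the notation here is generic and not tied to a numbered equation of this manuscript:

$$p(x_n \mid y_{0:n-1}) = \int p(x_n \mid x_{n-1})\, p(x_{n-1} \mid y_{0:n-1})\, dx_{n-1},$$

$$p(x_n \mid y_{0:n}) = \frac{p(y_n \mid x_n)\, p(x_n \mid y_{0:n-1})}{\int p(y_n \mid x_n)\, p(x_n \mid y_{0:n-1})\, dx_n}.$$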
Solutions to the nonlinear filtering problem fall into two categories: global methods and local methods. In the global approach, one attempts to solve a PDE (e.g. the Zakai equation or the Kushner-Stratonovich equation) instead of the ODE that arises in the linear case; these equations are mostly analytically intractable, hence numerical approximation techniques are needed to solve them. In special scenarios (e.g. the exponential family) with some assumptions, the nonlinear filtering problem can admit tractable solutions. In the local approach, finite-sum approximations (e.g. the Gaussian sum filter) or linearization techniques (e.g. the EKF) are usually used. In the EKF, by defining
$$\hat{F}_{n+1,n} = \left.\frac{df(x)}{dx}\right|_{x=\hat{x}_n}, \qquad \hat{G}_n = \left.\frac{dg(x)}{dx}\right|_{x=\hat{x}_{n|n-1}},$$
the equations (2a)–(2b) can be linearized into (3a)–(3b), and the conventional Kalman filtering technique can then be employed. The details of the EKF can be found in many books, e.g. [238], [12], [96], [80], [195], [205], [206]. Because the EKF always approximates the posterior $p(x_n|y_{0:n})$ as a Gaussian, it works well for some types of nonlinear problems, but it may perform poorly when the true posterior is non-Gaussian (e.g. heavily skewed or multimodal). Gelb [174] provided an early overview of the uses of the EKF. It is noted that the estimate given by the EKF is usually biased, since in general $E[f(x)] \neq f(E[x])$.
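For concreteness, the following is a minimal numpy sketch of one EKF predict/update cycle built around the Jacobians $\hat{F}_{n+1,n}$ and $\hat{G}_n$ defined above. The additive-noise model, the callable arguments, and the covariances Q and R are illustrative assumptions, not the notation of (2a)–(3b):

```python
import numpy as np

def ekf_step(x_est, P, y, f, g, F_jac, G_jac, Q, R):
    """One predict/update cycle of an extended Kalman filter.

    f, g         : nonlinear state-transition and measurement functions
    F_jac, G_jac : their Jacobians, evaluated as in the text
    Q, R         : process- and measurement-noise covariances
    """
    # Predict: propagate the estimate through f, linearizing around x_est
    F = F_jac(x_est)                       # F_hat_{n+1,n} = df/dx at x = x_est
    x_pred = f(x_est)
    P_pred = F @ P @ F.T + Q
    # Update: linearize g around the predicted state x_{n|n-1}
    G = G_jac(x_pred)                      # G_hat_n = dg/dx at x = x_pred
    S = G @ P_pred @ G.T + R               # innovation covariance
    K = P_pred @ G.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (y - g(x_pred))   # correct with the measurement innovation
    P_new = (np.eye(len(x_new)) - K @ G) @ P_pred
    return x_new, P_new
```

The Gaussian approximation discussed above is visible in the last two lines: whatever the shape of the true posterior, only a mean and a covariance are carried forward.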
In summary, a number of methods have been developed
for nonlinear filtering problems:
• Linearization methods: first-order Taylor series expan-
sion (i.e. EKF), and higher-order filter [20], [437].
• Approximation by finite-dimensional nonlinear filters: the Beneš filter [33], [34], the Daum filter [111]-[113], and the projection filter [202], [55].
• Classic PDE methods, e.g. [282], [284], [285], [505],
[496], [497], [235].
• Spectral methods [312].
• Neural filter methods, e.g. [209].
• Numerical approximation methods, as to be discussed
in Section V.
C.1 Finite-dimensional Filters
The on-line solution of the FPK equation can be avoided if the unnormalized filtered density admits a finite-dimensional sufficient statistic. Beneš [33], [34] first explored the exact finite-dimensional filter^{32} in the nonlinear filtering scenario. Daum [111] extended the framework to a more general case that includes the Kalman filter and the Beneš filter as special cases [113]. Some new developments of the Daum filter with virtual measurements were summarized in [113]. The recently proposed projection filters [202], [53]-[57] also belong to the finite-dimensional filter family.
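To give the flavor of such existence conditions (recalled here as the standard statement of the scalar Beneš setting, not quoted from [33], [34]): for the scalar model $dx_t = f(x_t)\,dt + dw_t$ with observations $dy_t = x_t\,dt + dv_t$, an exact finite-dimensional filter exists when the drift satisfies the Riccati-type condition

$$f'(x) + f^2(x) = ax^2 + bx + c$$

for some constants $a$, $b$, $c$; this covers, e.g., $f(x) = \tanh(x)$ (for which $f' + f^2 = 1$) beyond the linear case.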
In [111], starting with SDE filtering theory, Daum intro-
duced a gradient function
$$r(t,x) = \frac{\partial}{\partial x} \ln \psi(t,x),$$
where $\psi(t,x)$ is the solution of the FPK equation of (11a), with the form
$$\frac{\partial \psi(t,x)}{\partial t} = -\frac{\partial \psi(t,x)}{\partial x}\,f - \psi\,\mathrm{tr}\!\left[\frac{\partial f}{\partial x}\right] + \frac{1}{2}\,\mathrm{tr}\!\left[A\,\frac{\partial^2 \psi}{\partial x\,\partial x^T}\right],$$
with an appropriate initial condition (see [111]), and $A = \sigma(t,x_t)\,\sigma(t,x_t)^T$. When the measurement equation (11b) is
^{32} Roughly speaking, a finite-dimensional filter is one that can be implemented by integrating a finite number of ODEs, or one whose sufficient statistic involves a finite number of variables.