Journal of Statistical Software
March 2011, Volume 39, Issue 2. http://www.jstatsoft.org/
Kalman Filtering in R
Fernando Tusell
University of the Basque Country
Abstract
Support in R for state space estimation via Kalman filtering was limited to one package,
until fairly recently. In the last five years, the situation has changed with no less than
four additional packages offering general implementations of the Kalman filter, including in
some cases smoothing, simulation smoothing and other functionality. This paper reviews
some of the offerings in R to help the prospective user to make an informed choice.
Keywords: state space models, Kalman filter, time series, R.
1. Introduction
The Kalman filter is an important algorithm, for which relatively little support existed in R
(R Development Core Team 2010) up until fairly recently. Perhaps one of the reasons is the
(deceptive) simplicity of the algorithm, which makes it easy for any prospective user to throw
in his/her own quick implementation.
While direct transcription of the equations of the Kalman filter as they appear in many
engineering or time series books may be sufficient for some applications, an all-around im-
plementation requires more complex coding. In Section 2, we provide a short overview of
available algorithms. It is against this background that we discuss in Section 3 the particular
choices made by four packages offering fairly general Kalman filtering in R, with some mention
of functions in other packages which cater to particular needs.
Kalman filtering is a large topic. In the sequel we focus on linear Gaussian models and
their estimation, which is what the packages we review offer in common (and the foundation
on which most anything else rests). Other functionalities present in some of the packages
examined include filtering and estimation of non-Gaussian models, simulation and disturbance
smoothing, and functions to help with the Bayesian analysis of dynamic linear models, none of which are assessed here.
2. Kalman filter algorithms
We shall consider a fairly general state-space model specification, sufficient for the purpose
of the discussion to follow in Section 3, even if not the most comprehensive. The notation
follows Harvey (1989). Let
$$\alpha_t = c_t + T_t \alpha_{t-1} + R_t \eta_t \qquad (1)$$
$$y_t = d_t + Z_t \alpha_t + \epsilon_t \qquad (2)$$

where $\eta_t \sim N(0, Q_t)$ and $\epsilon_t \sim N(0, H_t)$. The state equation (1) describes the dynamics of the state vector $\alpha_t$, driven by deterministic ($c_t$) and stochastic ($\eta_t$) inputs. The observation (or measurement) equation links the observed response $y_t$ with the unobserved state vector, with noise $\epsilon_t$ and (possibly) deterministic inputs $d_t$. Matrices $T_t$, $R_t$, $Z_t$, $Q_t$ and $H_t$ may depend on a vector of parameters, $\theta$, and be time varying or constant over time. The noises $\eta_t$ and $\epsilon_t$ are assumed serially and also mutually uncorrelated, i.e., $E[\eta_t \epsilon_s^\top] = 0$ for all $t, s$. The last assumption and the Gaussianity of $\eta_t$ and $\epsilon_t$ can be dispensed with; see e.g., Anderson and Moore (1979).
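To fix ideas, here is a minimal sketch of how the system matrices in (1)–(2) might be laid out in R for the simplest case, a univariate local level (random walk plus noise) model; the function and element names are ours for illustration and do not correspond to any particular package.

```r
## Local level model in the notation of (1)-(2):
##   alpha_t = alpha_{t-1} + eta_t,   y_t = alpha_t + eps_t.
## Illustrative representation only.
local_level <- function(sigma2_eta, sigma2_eps) {
  list(Tt = matrix(1), ct = matrix(0),           # state transition and intercept
       Rt = matrix(1), Qt = matrix(sigma2_eta),  # state noise loading and variance
       Zt = matrix(1), dt = matrix(0),           # observation matrix and intercept
       Ht = matrix(sigma2_eps))                  # observation noise variance
}
```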
2.1. The Kalman filter
The Kalman filter equations, with slight notational variations, are standard in any textbook:
see, e.g., Anderson and Moore (1979), Simon (2006), Durbin and Koopman (2001), Grewal
and Andrews (2001), West and Harrison (1997) or Shumway and Stoffer (2006), to name only
a few. We reproduce those equations here, however, as repeated reference is made to them in
the sequel. Define
$$a_{t-1} = E[\alpha_{t-1} \mid y_0, \ldots, y_{t-1}] \qquad (3)$$
$$P_{t-1} = E[(\alpha_{t-1} - a_{t-1})(\alpha_{t-1} - a_{t-1})^\top]; \qquad (4)$$

estimates of the state vector and its covariance matrix at time $t$ with information available at time $t-1$, $a_{t|t-1}$ and $P_{t|t-1}$ respectively, are given by the time update equations

$$a_{t|t-1} = T_t a_{t-1} + c_t \qquad (5)$$
$$P_{t|t-1} = T_t P_{t-1} T_t^\top + R_t Q_t R_t^\top. \qquad (6)$$
Let $F_t = Z_t P_{t|t-1} Z_t^\top + H_t$. If a new observation is available at time $t$, then $a_{t|t-1}$ and $P_{t|t-1}$ can be updated with the measurement update equations

$$a_t = a_{t|t-1} + P_{t|t-1} Z_t^\top F_t^{-1} (y_t - Z_t a_{t|t-1} - d_t) \qquad (7)$$
$$P_t = P_{t|t-1} - P_{t|t-1} Z_t^\top F_t^{-1} Z_t P_{t|t-1}. \qquad (8)$$
Equations (5)–(6) and (7)–(8) taken together make up the Kalman filter. Substituting (5)
in (7) and (6) in (8), a single set of equations linking $a_{t-1}$ and $P_{t-1}$ to $a_t$ and $P_t$ can be obtained. In order to start the iteration we need initial values $a_{-1}$ and $P_{-1}$ (or $a_{0|-1}$ and $P_{0|-1}$).
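As an illustration of how directly equations (5)–(8) translate into code, the following is a minimal, unoptimized R transcription of a single filter iteration; the function and argument names are ours and are not taken from any of the packages reviewed below.

```r
## One covariance filter iteration, equations (5)-(8); illustrative sketch only.
kf_step <- function(a, P, y, Tt, ct, Rt, Qt, Zt, dt, Ht) {
  ## Time update, equations (5)-(6)
  a_pred <- ct + Tt %*% a
  P_pred <- Tt %*% P %*% t(Tt) + Rt %*% Qt %*% t(Rt)
  ## Measurement update, equations (7)-(8)
  Ft <- Zt %*% P_pred %*% t(Zt) + Ht       # innovation covariance
  Kt <- P_pred %*% t(Zt) %*% solve(Ft)     # Kalman gain
  e  <- y - dt - Zt %*% a_pred             # innovation
  a_filt <- a_pred + Kt %*% e
  P_filt <- P_pred - Kt %*% Zt %*% P_pred
  list(a = a_filt, P = P_filt, a_pred = a_pred, P_pred = P_pred, e = e, Ft = Ft)
}
```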
The filter equations (6) and (8) propagate the covariance matrix of the state, and are said
to define a covariance filter (CF). Equivalent equations can be written which propagate the
matrix $P_t^{-1}$, giving an information filter (IF). Information filters require in general more computational effort. One possible advantage is that they provide a natural way to specify complete uncertainty about the initial value of a component of the state: we can set the corresponding diagonal term in the information matrix to zero. With a covariance filter, we have to set the corresponding variance in $P_{0|-1}$ to a "large" number, or else use exact diffuse initialization, an option described below.
Direct transcription of the equations making up the Kalman filter into computer code is
straightforward. It was soon noticed, though, that the resulting programs suffered from
numerical instability; see for instance Bucy and Joseph (1968). In particular, buildup of
floating point errors in equation (8) may eventually yield non symmetric or non positive
definite $P_t$ matrices. An alternative to equation (8) (more expensive from the computational point of view) is:

$$P_t = (I - K_t Z_t) P_{t|t-1} (I - K_t Z_t)^\top + K_t H_t K_t^\top \qquad (9)$$

with $K_t = P_{t|t-1} Z_t^\top F_t^{-1}$; but even this ("Joseph stabilized form", see Bucy and Joseph
(1968), p. 175) may prove insufficient to prevent roundoff error degeneracy in the filter. A
detailed reference describing the pitfalls associated with the numerical implementation of the
Kalman filter is Bierman (1977); see also Grewal and Andrews (2001), Chap. 6 and Anderson
and Moore (1979), Chap. 6. Square root algorithms, which we discuss next, go a long way toward improving the numerical stability of the filter.
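As a sketch (again with illustrative names of our own), the Joseph stabilized form (9) replaces only the covariance update line of a transcription such as kf_step() above:

```r
## Joseph stabilized covariance update, equation (9); illustrative sketch only.
joseph_update <- function(P_pred, Kt, Zt, Ht) {
  A <- diag(nrow(P_pred)) - Kt %*% Zt
  A %*% P_pred %*% t(A) + Kt %*% Ht %*% t(Kt)
}
```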
2.2. Square root algorithms
Consider a matrix $S_t$ such that $P_t = S_t S_t^\top$; square root covariance filter (SRCF) algorithms propagate $S_t$ instead of $P_t$, with two main benefits (cf. Anderson and Moore (1979), § 6.5): (i) reconstitution of $P_t$ from $S_t$ will always yield a symmetric non-negative matrix, and (ii) the numerical condition of $S_t$ will in general be much better than that of $P_t$. If instead of the covariance matrix we choose to factor the information matrix $P_t^{-1}$, we have a square root information filter (SRIF) algorithm.
It is easy to produce a replacement for the time update equation (6). Consider an orthogonal
matrix G such that:
$$\begin{pmatrix} M \\ 0 \end{pmatrix} = G \begin{pmatrix} S_{t-1}^\top T_t^\top \\ (Q_t^{1/2})^\top R_t^\top \end{pmatrix} \qquad (10)$$
where M is an upper triangular matrix. Matrix G can be constructed in a variety of ways,
including repeated application of Householder or Givens transforms. Multiplication of the
transpose of (10) by itself produces, taking into account the orthogonality of G,
$$M^\top M = T_t S_{t-1} S_{t-1}^\top T_t^\top + R_t Q_t^{1/2} (Q_t^{1/2})^\top R_t^\top \qquad (11)$$
$$= T_t P_{t-1} T_t^\top + R_t Q_t R_t^\top; \qquad (12)$$
comparison with (6) shows that $M$ is a possible choice for $S_{t|t-1}$. Likewise, it can be shown (Simon 2006, Section 6.3.4) that the orthogonal matrix $G^*$ such that

$$\begin{pmatrix} (H_t + Z_t P_{t|t-1} Z_t^\top)^{1/2} & K_t^{*\top} \\ 0 & M^* \end{pmatrix} = G^* \begin{pmatrix} (H_t^{1/2})^\top & 0 \\ S_{t|t-1}^\top Z_t^\top & S_{t|t-1}^\top \end{pmatrix} \qquad (13)$$

produces in the left-hand side of (13) a block $M^*$ that can be taken as $S_t$, and thus performs the measurement update.
Although the matrices performing the block triangularizations described in equations (10)
and (13) can be obtained quite efficiently (Lawson and Hanson 1974; Gentle 2007), clearly
the computational effort is greater than that required by the time and measurement updates
in equations (6) and (8); square root filtering does have a cost.
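For concreteness, the triangularization in (10) can be obtained with an ordinary QR decomposition. The following R sketch (our own illustrative code, assuming $Q_t$ is positive definite so that chol() applies) propagates a square root of the state covariance through one time update:

```r
## Square root covariance time update, equation (10); illustrative sketch only.
## S is a square root of P[t-1] (P = S %*% t(S)); returns a square root of P[t|t-1].
srcf_time_update <- function(S, Tt, Rt, Qt) {
  Q_half <- chol(Qt)                 # upper triangular, Qt = t(Q_half) %*% Q_half
  pre <- rbind(t(S) %*% t(Tt),       # S_{t-1}' T_t'
               Q_half %*% t(Rt))     # (Q_t^{1/2})' R_t'
  M <- qr.R(qr(pre))                 # upper triangular factor with M' M = P[t|t-1]
  t(M)                               # lower triangular square root of P[t|t-1]
}
```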
In the equations above $G$ and $G^*$ can be chosen so that $M$ and $M^*$ are Cholesky factors of the corresponding covariance matrices. This need not be so, and other factorizations are possible. In particular, the factors in the singular value decomposition of $P_{t-1}$ can be propagated: see for instance Zhang and Li (1996) and Appendix B of Petris et al. (2009). Also, we may note that time and measurement updates can be merged in a single triangularization (see Anderson and Moore 1979, p. 148, and Vanbegin and Verhaegen 1989).
2.3. Sequential processing
In the particular case where $H_t$ is diagonal, the components of $y_t^\top = (y_{1t}, \ldots, y_{pt})$ are uncorrelated. We may pretend that we observe one $y_{it}$ at a time and perform $p$ univariate measurement updates similar to (7)–(8), followed by a time update (5)–(6); see Durbin and Koopman (2001, Section 6.4) or Anderson and Moore (1979) for details.

The advantage of sequential processing is that $F_t$ becomes $1 \times 1$, and the inversion of a $p \times p$ matrix in equations (7)–(8) is avoided. Clearly, the situation where we stand to gain most from this strategy is when the dimension of the observation vector $y_t$ is large.

Sequential processing can be combined with square root covariance and information filters, although in the latter case the computational advantages seem unclear (Anderson and Moore 1979, p. 142; see also Bierman 1977).
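A sketch of such a sequence of univariate updates, under the assumption that $H_t$ is diagonal with diagonal elements h (all names, again, are ours for illustration):

```r
## Sequential (univariate) measurement updates for diagonal H_t; illustrative sketch.
## a, P: predicted state mean and covariance; y, dt: p-vectors; Zt: p x m; h: diag(H_t).
seq_measurement_update <- function(a, P, y, Zt, dt, h) {
  a <- matrix(a, ncol = 1)
  for (i in seq_along(y)) {
    z  <- Zt[i, , drop = FALSE]              # 1 x m row of Z_t
    Fi <- drop(z %*% P %*% t(z)) + h[i]      # scalar innovation variance: no p x p inversion
    Ki <- (P %*% t(z)) / Fi                  # m x 1 gain
    e  <- y[i] - dt[i] - drop(z %*% a)       # scalar innovation
    a  <- a + Ki * e
    P  <- P - Ki %*% z %*% P
  }
  list(a = drop(a), P = P)
}
```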
Aside from the case where $H_t$ is diagonal, sequential processing may also be used when $H_t$ is block diagonal (in which case we can perform a sequence of reduced-dimension updates) or when it can be reduced to diagonal form by a linear transformation. In the last case, assuming full rank, let $H_t^{-1/2}$ be a square root of $H_t^{-1}$. Multiplying (2) through by $H_t^{-1/2}$ we get

$$y_t^* = d_t^* + Z_t^* \alpha_t + \epsilon_t^* \qquad (14)$$

with $E[\epsilon_t^* \epsilon_t^{*\top}] = I$. If the matrix $H_t$ is time-invariant, the same linear transformation will decorrelate the measurements at all times.
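For a full-rank, time-invariant $H_t$ such a transformation can be computed once, for instance from a Cholesky factor, as in this sketch (our own illustrative code):

```r
## Decorrelating transform for a constant, full-rank H_t; illustrative sketch only.
## Returns transformed y, Z_t and d_t so that the new observation noise has identity covariance.
decorrelate <- function(y, Zt, dt, Ht) {
  W <- solve(t(chol(Ht)))                  # W %*% Ht %*% t(W) = I, i.e. W acts as H_t^{-1/2}
  list(y = W %*% y, Zt = W %*% Zt, dt = W %*% dt)
}
```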
2.4. Smoothing and the simulation smoother
The algorithms presented produce predicted $a_{t|t-1}$ or filtered $a_t$ values of the state vector $\alpha_t$. Sometimes it is of interest to estimate $a_{t|N}$ for $0 < t \le N$, i.e., $E[\alpha_t \mid y_0, \ldots, y_N]$, the value of the state vector given all past and future observations. It turns out that this can be done by running the Kalman filter once and then a recursion backwards in time (Durbin and Koopman 2001, Section 4.3; Harvey 1989, Section 3.6).

In some cases, and notably for the Bayesian analysis of the state space model, it is of interest to generate random samples of state and disturbance vectors, conditional on the observations $y_0, \ldots, y_N$. Frühwirth-Schnatter (1994) and Carter and Kohn (1994) provided algorithms to that purpose, improved by de Jong (1995). Durbin and Koopman (2001, Section 4.7) draw on work of the last author; the algorithm they present is fairly involved. Recently, Durbin and Koopman (2002) have provided a much simpler algorithm for simulation smoothing of the Gaussian state space model; see also Strickland et al. (2009).
2.5. Exact diffuse initial conditions
As mentioned above, the Kalman filter iteration needs starting values $a_{-1}$ and $P_{-1}$. When nothing is known about the initial state, a customary practice has been to set $P_{-1}$ with "large" elements along the main diagonal, to reflect our uncertainty. This practice has been criticized on the ground that
criticized on the ground that
“While [it can] be useful for approximate exploratory work, it is not recommended
for general use, since it can lead to large rounding errors.”
(Durbin and Koopman 2001, p. 101). Grewal and Andrews (2001, Section 6.3.2) show how this may come about when elements of $P_{-1}$ are set to values "too large" relative to the measurement variances. An alternative is to use an information filter with the initial information matrix (or some diagonal elements of it) set to zero. Durbin and Koopman (2001, Section 5.3) advocate a different approach; see also Koopman and Durbin (2003) and Koopman (1997). The last reference discusses several alternatives and their respective merits.
2.6. Maximum likelihood estimation
An important special case of the state space model is that in which some or all of the matrices $T_t$, $R_t$, $Z_t$, $Q_t$ and $H_t$ in equations (1)–(2) are time invariant and depend on a vector of parameters, $\theta$, that we seek to estimate.
Assuming $\alpha_0 \sim N(a_0, P_0)$ with both $a_0$ and $P_0$ known, the log likelihood is given by

$$L(\theta) = \log p(y_0, \ldots, y_N \mid \theta) = -\frac{(N+1)p}{2}\log(2\pi) - \frac{1}{2}\sum_{t=0}^{N}\left(\log|F_t| + e_t^\top F_t^{-1} e_t\right) \qquad (15)$$

where $e_t = y_t - Z_t a_{t|t-1} - d_t$ is the one-step prediction error (Durbin and Koopman 2001, Section 7.2; the result goes back to Schweppe 1965). Except for $|F_t|$, all other quantities in (15) are computed when running the Kalman filter (cf. equations (5)–(8)); therefore, the likelihood is easily computed. (If we resort to sequential processing, Section 2.3, the analog of equation (15) requires neither determinants nor matrix inversions.)
Maximum likelihood estimation can therefore be implemented easily: it suffices to write a
routine computing (15) as a function of the parameters θ and use R functions such as optim
or nlminb to perform the optimization. An alternative is to use the EM algorithm, which quite naturally adapts to this likelihood maximization problem, but that approach has generally been found to be slower than quasi-Newton methods; see for instance Shumway and Stoffer (2006), p. 345.
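A sketch of this approach, reusing the kf_step() transcription above, is shown below; the build() function mapping $\theta$ to the system matrices and initial conditions is hypothetical and model specific.

```r
## Negative log likelihood (15) for a model parameterized by theta; illustrative sketch.
## build(theta) is a hypothetical user function returning Tt, ct, Rt, Qt, Zt, dt, Ht, a0, P0.
## y is a p x (N+1) matrix with one observation vector per column.
kf_negloglik <- function(theta, y, build) {
  m <- build(theta)
  a <- m$a0; P <- m$P0
  ll <- 0
  for (t in seq_len(ncol(y))) {
    s <- kf_step(a, P, y[, t], m$Tt, m$ct, m$Rt, m$Qt, m$Zt, m$dt, m$Ht)
    ll <- ll - 0.5 * (length(s$e) * log(2 * pi) +
                      as.numeric(determinant(s$Ft, logarithm = TRUE)$modulus) +
                      drop(t(s$e) %*% solve(s$Ft) %*% s$e))
    a <- s$a; P <- s$P
  }
  -ll
}
## Usage sketch: fit <- optim(par = theta_start, fn = kf_negloglik, y = y, build = build)
```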
3. Kalman filtering in R
There are several packages available from the Comprehensive R Archive Network (CRAN)
offering general Kalman filter capabilities, plus a number of functions scattered in other
packages which cater to special models or problems. We describe five of those packages in
chronological order of first appearance on CRAN.