The R Package splm: A Detailed Look at Statistical Software for Spatial Panel Data Models


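Spatial statistics studies the relationships among spatially referenced data; the focus here is on panel data models as applied in spatial economics. The paper introduces the R package `splm`, which is designed for estimating and testing a range of spatial panel data models. The package supports maximum likelihood (ML) and generalized moments (GM) estimation for both fixed effects and random effects spatial panel models.

`splm` was developed by Giovanni Millo (Generali SpA) and Gianfranco Piras (West Virginia University). They illustrate the package's functionality with a classic example: Munnell's (1990) productivity data for 48 US states observed over 17 years. The paper aims to show potential users how to build spatial panel data models with `splm` and how the package relates to other available software. The methodological background includes Anselin, Le Gallo, and Jayet (2008) and Kapoor, Kelejian, and Prucha (year not given), which deal with spatial autocorrelation, spatial error structures, and the use of contiguity or spatial weights matrices.

The keywords for `splm` are spatial panel data, maximum likelihood, GM, LM (Lagrange multiplier) tests, and R. The paper therefore covers both the theory and its practical use, helping readers handle spatial panel data sets, choose among model specifications, and carry out valid statistical inference. It is a useful resource for practitioners in economic geography, urban planning, environmental science, and related fields.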

Journal of Statistical Software 7

we consider the implementation of both error term specifications. For the first specification, we implement maximum likelihood estimation of the random as well as the fixed effects models. For the second (simpler) specification, we implement both maximum likelihood and instrumental variables estimation under the random as well as the fixed effects assumption. The next section is devoted to the discussion of the ML implementation of the two models and Section 6 to the GM implementation of the second error specification.

5. ML implementation

Both random and fixed effects models are implemented within the same software framework. spml is the general wrapper function and the argument model controls the specification. In accordance with the syntax in plm, model takes the value "within" for fixed effects, "random" for random effects, and "pooling" for no effects. The spatial structure is specified by combining the logical argument lag (which, if true, adds a spatial autoregressive term in the dependent variable) with the argument spatial.error. This last argument takes three possible values: "b" ("Baltagi") for the specification in Equation 3, "kkp" ("Kapoor, Kelejian and Prucha") for the specification in Equation 7, and "none" for no spatial error correlation.
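As a rough sketch of how these arguments combine, the two calls below fit a fixed effects model with a spatial lag only and a pooled model with a spatial error only. The formula fm, the Produc data (from plm), the usaww weights matrix (from splm) and the conversion with spdep::mat2listw are assumptions taken from the package documentation rather than from this section, and the object names fe_lag and pool_sem are purely illustrative.

```r
# Sketch only: assumes the splm, plm and spdep packages are installed.
library(splm)

data("Produc", package = "plm")    # Munnell's panel: 48 states, 17 years
data("usaww", package = "splm")    # 48 x 48 spatial weights matrix

# Convert the weights matrix to a listw object (assumed setup step)
usalw <- spdep::mat2listw(usaww, style = "W")

# Formula assumed for illustration
fm <- log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp

# Fixed effects ("within") with a spatial lag and no spatial error
fe_lag <- spml(fm, data = Produc, listw = usalw,
               model = "within", lag = TRUE, spatial.error = "none")

# No individual effects ("pooling") with a spatially autocorrelated error
pool_sem <- spml(fm, data = Produc, listw = usalw,
                 model = "pooling", lag = FALSE, spatial.error = "b")

summary(fe_lag)
summary(pool_sem)
```

Switching model, lag and spatial.error independently is what lets a single wrapper cover all the nested specifications.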

5.1. Random effects model

For a model with spatially autocorrelated error components, ordinary least squares (OLS) is inefficient even when $\sigma^2_\mu = 0$. Analogously, OLS on a random effects model (even without spatial components) is also inefficient. An alternative (i.e., more efficient) way of estimating the model is via maximum likelihood. In the present section we discuss the estimation approach of the full specification, i.e., the one with a spatial lag, random effects and spatial correlation of the form specified in Equation 3.

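Scaling the error covariance matrix by the idiosyncratic error variance $\sigma^2_\varepsilon$, and denoting $\phi = \sigma^2_\mu/\sigma^2_\varepsilon$, $\bar{J}_T = J_T/T$, $E_T = I_T - \bar{J}_T$ and $A = (I_N - \lambda W)$, the expressions for the scaled error covariance matrix $\Sigma$, its inverse $\Sigma^{-1}$, and its determinant $|\Sigma|$ can be written respectively as

$$\Sigma = \phi (J_T \otimes I_N) + I_T \otimes (B^\top B)^{-1}$$

$$\Sigma^{-1} = \bar{J}_T \otimes \left(T\phi I_N + (B^\top B)^{-1}\right)^{-1} + E_T \otimes B^\top B$$

$$|\Sigma| = \left|T\phi I_N + (B^\top B)^{-1}\right| \, \left|(B^\top B)^{-1}\right|^{T-1}$$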
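The three expressions can be checked numerically on a toy example. The sketch below uses base R only and assumes $B = I_N - \rho W$ (the form implied by Equation 3, which is not reproduced in this section); the ring-shaped weights matrix and the parameter values are made up for illustration.

```r
# Numerical check of Sigma, its inverse and its determinant (toy sizes).
N <- 4; Tt <- 3          # Tt = number of periods (avoids masking R's T)
phi <- 0.7; rho <- 0.3   # illustrative parameter values

# Row-standardised ring weights: each unit neighbours the next and previous one
W <- matrix(0, N, N)
for (i in 1:N) {
  W[i, i %% N + 1] <- 0.5
  W[i, (i - 2) %% N + 1] <- 0.5
}

I_N <- diag(N); I_T <- diag(Tt)
J_T <- matrix(1, Tt, Tt)
Jbar_T <- J_T / Tt
E_T <- I_T - Jbar_T

B <- I_N - rho * W        # assumed definition of B
BtB <- crossprod(B)       # B'B
BtB_inv <- solve(BtB)

# Closed-form expressions from the text
Sigma <- phi * kronecker(J_T, I_N) + kronecker(I_T, BtB_inv)
Sigma_inv <- kronecker(Jbar_T, solve(Tt * phi * I_N + BtB_inv)) +
  kronecker(E_T, BtB)
det_Sigma <- det(Tt * phi * I_N + BtB_inv) * det(BtB_inv)^(Tt - 1)

# Compare with brute-force inversion and determinant
max_inv_err <- max(abs(Sigma_inv - solve(Sigma)))
det_rel_err <- abs(det_Sigma - det(Sigma)) / det(Sigma)
```

Both discrepancies should be at the level of floating-point rounding. The practical point of the closed forms is that only N x N matrices need to be inverted, never the full NT x NT covariance.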

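Substituting into the general formula given in Anselin (1988, Ch. 6), one can derive the expression of the likelihood:

$$L(\beta, \sigma^2_\varepsilon, \phi, \lambda, \rho) = -\frac{NT}{2}\ln 2\pi - \frac{NT}{2}\ln \sigma^2_\varepsilon + T \ln |A| - \frac{1}{2}\ln\left|T\phi I_N + (B^\top B)^{-1}\right| + (T-1)\ln |B| - \frac{1}{2\sigma^2_\varepsilon}\, u^\top \Sigma^{-1} u$$

where $u = Ay - X\beta$.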
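A minimal base R sketch of this log-likelihood follows. It is not the package's internal code: the function name sarar_re_loglik is hypothetical, $B = I_N - \rho W$ is assumed as in Equation 3, the data are stacked by period (all N units for the first period, then the second, and so on), and the inputs in the usage lines are simulated noise used only to exercise the function.

```r
# Log-likelihood of the full model for given parameter values (sketch).
sarar_re_loglik <- function(beta, sigma2, phi, lambda, rho, y, X, W, Tt) {
  N <- nrow(W)
  I_N <- diag(N)
  A <- I_N - lambda * W
  B <- I_N - rho * W                      # assumed definition of B
  BtB <- crossprod(B)
  M <- Tt * phi * I_N + solve(BtB)        # T*phi*I_N + (B'B)^{-1}
  Jbar <- matrix(1 / Tt, Tt, Tt)
  Sigma_inv <- kronecker(Jbar, solve(M)) + kronecker(diag(Tt) - Jbar, BtB)
  # Residual u = (I_T %x% A) y - X beta
  u <- kronecker(diag(Tt), A) %*% y - X %*% beta
  logdet <- function(m) as.numeric(determinant(m, logarithm = TRUE)$modulus)
  -N * Tt / 2 * log(2 * pi) - N * Tt / 2 * log(sigma2) +
    Tt * logdet(A) - 0.5 * logdet(M) + (Tt - 1) * logdet(B) -
    as.numeric(crossprod(u, Sigma_inv %*% u)) / (2 * sigma2)
}

# Toy usage with simulated inputs (all values illustrative)
set.seed(42)
N <- 5; Tt <- 4
W <- matrix(1, N, N) - diag(N)
W <- W / rowSums(W)                       # row-standardised weights
X <- cbind(1, rnorm(N * Tt))
y <- rnorm(N * Tt)
ll <- sarar_re_loglik(beta = c(0.5, 1), sigma2 = 1.2, phi = 0.8,
                      lambda = 0.2, rho = 0.3, y = y, X = X, W = W, Tt = Tt)
```

The term (T - 1) ln|B| comes from the determinant of $(B^\top B)^{-1}$ raised to the power T - 1, which is why no NT x NT determinant is ever computed.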

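We implement an iterative procedure to obtain the maximum likelihood estimates. Starting from initial values for λ, ρ and φ, we obtain estimates for β and $\sigma^2_\varepsilon$ from the first order conditions:

$$\beta = (X^\top \Sigma^{-1} X)^{-1} X^\top \Sigma^{-1} A y$$

$$\sigma^2_\varepsilon = (Ay - X\beta)^\top \Sigma^{-1} (Ay - X\beta)/NT.$$

Here $Ay$ is shorthand for $(I_T \otimes A)y$, the spatially filtered dependent variable.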
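The first order conditions amount to a GLS regression of the spatially filtered dependent variable on X. A minimal sketch in base R, with a hypothetical helper gls_step and simulated data, is given below; setting both the filter and $\Sigma^{-1}$ to the identity reduces it to OLS with the ML variance estimate, which makes a convenient sanity check.

```r
# GLS step for beta and sigma2 given the filter A and Sigma^{-1} (sketch).
gls_step <- function(y, X, A_big, Sigma_inv) {
  Ay <- A_big %*% y                          # (I_T %x% A) y
  XtSi <- crossprod(X, Sigma_inv)            # X' Sigma^{-1}
  beta <- solve(XtSi %*% X, XtSi %*% Ay)     # (X'S^{-1}X)^{-1} X'S^{-1} A y
  res <- Ay - X %*% beta
  sigma2 <- as.numeric(crossprod(res, Sigma_inv %*% res)) / length(y)
  list(beta = drop(beta), sigma2 = sigma2)
}

# Sanity check on simulated data: identity filter and covariance give OLS
set.seed(7)
n <- 30
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(2, -1) + rnorm(n))
fit <- gls_step(y, X, A_big = diag(n), Sigma_inv = diag(n))
ols <- lm.fit(X, y)
```

Note that the variance estimate divides by NT rather than by the residual degrees of freedom, as in the second first order condition.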

8 splm: Spatial Panel Data Models in R

The likelihood can be concentrated and maximized with respect to λ, ρ and φ. The estimated values of λ, ρ and φ are in turn used to update the expression for A and $\Sigma^{-1}$. These steps are then repeated until a convergence criterion is met. In other words, for a specific Σ the estimation can be operationalized by a two step iterative procedure that alternates between generalized least squares (GLS, for β and $\sigma^2_\varepsilon$) and concentrated likelihood (for the remaining parameters) until convergence. From an implementation point of view there are (at least) a couple of different ways to proceed. First of all, we decided to include the GLS step within the objective function to be maximized (i.e., the function to be used as an argument to the optimizer). In other words, the GLS step is part of the optimization process of the likelihood. We obtain standard errors for β from GLS, and we employ a numerical Hessian to perform statistical inference on the error components.
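The design described above, with the GLS step nested inside the objective handed to the optimizer, can be sketched as follows. This is an illustration on simulated data and not the package's source: the ring weights matrix, the true parameter values, the starting values and the box constraints are all assumptions, and $B = I_N - \rho W$ is taken from Equation 3. nlminb comes from stats and fdHess from nlme.

```r
# Concentrated ML with the GLS step inside the objective (sketch).
set.seed(123)
N <- 20; Tt <- 8
W <- matrix(0, N, N)                       # row-standardised ring weights
for (i in 1:N) {
  W[i, i %% N + 1] <- 0.5
  W[i, (i - 2) %% N + 1] <- 0.5
}
I_N <- diag(N); I_T <- diag(Tt); Jbar <- matrix(1 / Tt, Tt, Tt)
logdet <- function(m) as.numeric(determinant(m, logarithm = TRUE)$modulus)

# Simulate from the full model: lambda = 0.3, rho = 0.4, phi = 1, sigma2 = 1
X <- cbind(1, rnorm(N * Tt))
mu <- rep(rnorm(N), times = Tt)            # individual effects, stacked by period
eps <- as.vector(kronecker(I_T, solve(I_N - 0.4 * W)) %*% rnorm(N * Tt))
y <- as.vector(kronecker(I_T, solve(I_N - 0.3 * W)) %*%
                 (X %*% c(1, 2) + mu + eps))

# Negative concentrated log-likelihood in theta = (lambda, rho, phi)
neg_cll <- function(theta) {
  lambda <- theta[1]; rho <- theta[2]; phi <- theta[3]
  A <- I_N - lambda * W
  B <- I_N - rho * W
  BtB <- crossprod(B)
  M <- Tt * phi * I_N + solve(BtB)
  Sigma_inv <- kronecker(Jbar, solve(M)) + kronecker(I_T - Jbar, BtB)
  Ay <- as.vector(kronecker(I_T, A) %*% y)
  # GLS step for beta and sigma2 at the current theta
  XtSi <- crossprod(X, Sigma_inv)
  beta <- solve(XtSi %*% X, XtSi %*% Ay)
  res <- Ay - X %*% beta
  sigma2 <- as.numeric(crossprod(res, Sigma_inv %*% res)) / (N * Tt)
  # Plugging sigma2 back in turns the quadratic form into the constant NT/2
  ll <- -N * Tt / 2 * (log(2 * pi) + log(sigma2) + 1) +
    Tt * logdet(A) - 0.5 * logdet(M) + (Tt - 1) * logdet(B)
  -ll
}

opt <- nlminb(start = c(0, 0, 0.5), objective = neg_cll,
              lower = c(-0.9, -0.9, 1e-6), upper = c(0.9, 0.9, Inf))

# Finite-difference Hessian at the optimum; its inverse approximates the
# covariance of (lambda, rho, phi)
hess <- nlme::fdHess(opt$par, neg_cll)$Hessian
```

Because β and the idiosyncratic variance are profiled out inside neg_cll, the optimizer searches over only three parameters regardless of how many regressors the model has.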

Illustration

ML estimation of spatial panel random effects models is performed by spml with the argument model set to "random". The arguments lag and spatial.error allow the estimation of all combinations of a spatial lag with the different specifications for the error term. The same specifications but without random effects can be estimated by setting model to "pooling". It should be noted that the effect argument can only be set to "individual" in the random effects context, and it will turn out to be more useful when discussing fixed effects models.

As for other specific parameters, we provide two ways to set the initial values of the parameters, managed through the optional argument initval. The first option is to specify a numeric vector of initial values. As an alternative, when initval is set to "estimate" the initial values are retrieved from the estimation of nested specifications. As an example, when estimating the full model, the initial value for the spatial correlation parameter is taken to be the estimated ρ from a panel regression with spatially correlated errors. Analogously, the initial value of λ is the estimated spatial autocorrelation coefficient from the spatial autoregressive model; and, finally, an initial value for φ is obtained by estimating a random effects model.

Assuming that both the spatial lag and the spatial error are defined according to the same weights matrix, Munnell's data lead to the following results for the most general model:

R> sararremod <- spml(formula = fm, data = Produc, index = NULL,
+    listw = usalw, model = "random", lag = TRUE, spatial.error = "b")
R> summary(sararremod)
Spatial panel random effects ML model

Call:
spml(formula = fm, data = Produc, index = NULL, listw = usalw,
    model = "random", lag = TRUE, spatial.error = "b")

Note that these steps remain valid when the model to be estimated is one of the nested specifications where, for example, one of the spatial coefficients is restricted to zero.

There are many optimizers available under R. Our final choice was to use nlminb. While leading to similar values for the estimated parameters, it proved to be faster than other optimizers.

The numerical Hessian is implemented in the function fdHess available from nlme. The Hessian is evaluated at the ML parameter values using finite differences.

If neither of the two options is specified, the optimization will start at zero.
