Bayesian Analysis: Handling Skewed Errors and Fat Tails in Linear Regression


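This document, "On Bayesian Modeling of Fat Tails and Skewness," by Carmen Fernández and Mark F. J. Steel, was published in the Journal of the American Statistical Association in 1998. Its central topic is how to handle fat tails and skewness in the errors of linear regression models within a Bayesian framework. Both features appear often in real data sets, so the authors propose a general method for formally accounting for them during inference.

The conventional assumption is that the error terms are normal, that is, symmetric with finite variance. Many real-world economic and financial data sets violate this assumption and show unusually large values or an asymmetric distribution. The skewed Student-t distribution extends the symmetric Student-t to this setting: it allows thicker tails and also accommodates asymmetry in the errors. In the Bayesian approach, prior knowledge is combined with the observed data and the model parameters are estimated through their posterior distribution. The approach may involve nonparametric or semiparametric elements (the excerpt below allows the parameter ν to be of infinite dimension). Uncertainty about quantities such as the skewness and the tail thickness is quantified as part of the posterior, which gives a more robust basis for prediction and decision making.

The specific steps may include:

1. **Data preprocessing**: identify and record fat tails and skewness in the data, typically with a normality test such as the Shapiro-Wilk or Kolmogorov-Smirnov test.
2. **Model selection**: within the Bayesian framework, choose a linear regression model suited to skewed-t or other non-normal errors, such as Bayesian t-regression or skew-t regression.
3. **Prior specification**: assign suitable prior distributions to the model parameters, taking into account features of the data such as tail shape and degree of skewness.
4. **Posterior computation**: update the prior with the observed data via Bayes' rule to obtain the posterior distribution, which yields parameter estimates together with their uncertainty.
5. **Model evaluation and diagnostics**: assess model performance through validation, hold-out samples, or cross-validation, and check whether the model captures the fat tails and skewness of the data.
6. **Application and interpretation**: apply the results to prediction or decision making, and explain the model's predictive ability and its uncertainty.

According to this summary, the authors also describe an application from a fish stock assessment project (Fishstock Assessment), which would tie the research to practical problems rather than to theory alone. Readers of the paper can learn how to handle the complexity of real-world data within a Bayesian framework and how to use these models for data analysis and decision support. The paper is reported to have been cited 463 times since 1998, which indicates its influence on later research in statistics and economics.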

which is a strictly increasing function of $\gamma$, taking values anywhere in $(-1, 1)$. The results in Arnold and Groeneveld (1995) imply that the latter skewness measure maintains the convex ordering of distributions introduced by van Zwet (1964) if $f(\cdot)$ is differentiable. Clearly, we also have $SM(\varepsilon|\gamma) = -SM(\varepsilon|1/\gamma)$ and $SM(\varepsilon|\gamma = 1) = 0$. In contrast to the skewness coefficients in (2.6) and (2.7), (2.8) does not depend on the choice of $f(\cdot)$, and the entire range of this skewness measure can be covered by choosing $\gamma$ appropriately, with $\lim_{\gamma \to 0} SM(\varepsilon|\gamma) = -1$ (extreme left skewness) and $\lim_{\gamma \to \infty} SM(\varepsilon|\gamma) = 1$ (extreme right skewness).
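The properties listed above can be checked numerically. Equation (2.8) itself is not part of this excerpt, so the sketch below rests on an assumption: that the measure is the Arnold and Groeneveld quantity, one minus twice the probability mass to the left of the mode, and that under the two-piece density used in this paper the mass below the mode at zero is $1/(1+\gamma^2)$. Together these give the closed form $(\gamma^2 - 1)/(\gamma^2 + 1)$. The function name `skewness_measure` is hypothetical.

```python
def skewness_measure(gamma: float) -> float:
    """Assumed closed form of SM(eps | gamma): (gamma^2 - 1) / (gamma^2 + 1).

    This is an assumption, not a formula quoted from the excerpt; it is
    chosen because it satisfies every property the text states for (2.8).
    """
    if gamma <= 0.0:
        raise ValueError("gamma must be strictly positive")
    g2 = gamma * gamma
    return (g2 - 1.0) / (g2 + 1.0)


if __name__ == "__main__":
    # Values approach -1 as gamma -> 0 and +1 as gamma -> infinity.
    for gamma in (0.01, 0.5, 1.0, 2.0, 100.0):
        print(f"gamma = {gamma:>6}: SM = {skewness_measure(gamma):+.4f}")
```

Under this assumed form, the antisymmetry $SM(\varepsilon|\gamma) = -SM(\varepsilon|1/\gamma)$ and the value zero at $\gamma = 1$ follow directly, and no property of $f(\cdot)$ enters the calculation.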

3. EFFECT OF SKEWNESS ON THE EXISTENCE OF POSTERIOR MOMENTS

Let us now consider the impact of introducing skewness into the sampling distribution on Bayesian inference in the context of a general regression model. In particular, we examine the issue of existence of the posterior distribution and of its moments.

We shall assume the observables $y_i \in \Re$, $i = 1, \ldots, n$, to be generated from

$$y_i = g_i(\beta) + \sigma \varepsilon_i, \tag{3.1}$$

where $g_i(\cdot)$ is a known measurable function from $\Re^k$ ($k \geq 1$) to $\Re$, $\beta = (\beta_1, \ldots, \beta_k)' \in \Re^k$ parameterizes the location and $\sigma \in \Re_+$ is a scale parameter. We assume the error terms $\varepsilon_1, \ldots, \varepsilon_n$ to be i.i.d. given a parameter $\nu \in N$ (possibly of infinite dimension) and $\gamma \in \Re_+$ with conditional p.d.f.

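$$p(\varepsilon_i \mid \nu, \gamma) = \frac{2}{\gamma + \frac{1}{\gamma}} \left\{ f_\nu\!\left(\frac{\varepsilon_i}{\gamma}\right) I_{[0,\infty)}(\varepsilon_i) + f_\nu(\gamma \varepsilon_i)\, I_{(-\infty,0)}(\varepsilon_i) \right\}, \tag{3.2}$$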

where $f_\nu(\cdot)$ is unimodal and symmetric around zero. This stochastic assumption introduces two extra parameters into the problem: $\gamma$, the skewness parameter, as explained in the previous Section, and $\nu$, which can describe other properties of the sampling distribution. In particular, $\nu$ will control the thickness of the tails in the next Section.
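A minimal sketch of the two-piece density in (3.2) may make the construction concrete. It assumes that $f_\nu$ is the Student-t density with $\nu$ degrees of freedom, which is one unimodal symmetric choice and not something fixed by this Section. The names `student_t_pdf`, `skewed_pdf`, `draw_skewed` and `trapezoid_mass` are hypothetical. The sampler relies on the fact that (3.2) assigns mass $\gamma^2/(1+\gamma^2)$ to the non-negative half-line.

```python
import math
import random


def student_t_pdf(x: float, nu: float) -> float:
    """Symmetric Student-t density with nu degrees of freedom (assumed f_nu)."""
    log_const = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1.0) / 2.0 * math.log1p(x * x / nu))


def skewed_pdf(eps: float, nu: float, gamma: float) -> float:
    """Density (3.2): stretch by gamma on [0, inf), shrink by gamma on (-inf, 0)."""
    norm = 2.0 / (gamma + 1.0 / gamma)
    z = eps / gamma if eps >= 0.0 else gamma * eps
    return norm * student_t_pdf(z, nu)


def draw_skewed(rng: random.Random, nu: float, gamma: float) -> float:
    """Draw one error term from (3.2) by rescaling a half Student-t draw."""
    # Student-t draw: standard normal over the root of chi-square(nu) / nu.
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    half_t = abs(rng.gauss(0.0, 1.0)) / math.sqrt(chi2 / nu)
    # Choose the sign with P(eps >= 0) = gamma^2 / (1 + gamma^2).
    if rng.random() < gamma * gamma / (1.0 + gamma * gamma):
        return gamma * half_t
    return -half_t / gamma


def trapezoid_mass(nu: float, gamma: float, lo: float = -200.0,
                   hi: float = 200.0, steps: int = 40_000) -> float:
    """Trapezoid-rule approximation of the total mass of (3.2) on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (skewed_pdf(lo, nu, gamma) + skewed_pdf(hi, nu, gamma))
    total += sum(skewed_pdf(lo + i * h, nu, gamma) for i in range(1, steps))
    return total * h


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [draw_skewed(rng, 5.0, 2.0) for _ in range(20_000)]
    print("approximate total mass:", trapezoid_mass(5.0, 2.0))
    print("share of non-negative draws:", sum(d >= 0 for d in draws) / len(draws))
```

Setting `gamma = 1.0` recovers the symmetric density, and replacing `gamma` by `1.0 / gamma` mirrors the density around zero.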

We shall adopt the following class of prior distributions:

$$P_{(\beta,\sigma,\nu,\gamma)} = P_\beta \times P_\sigma \times P_\nu \times P_\gamma, \tag{3.3}$$

with $P_\sigma$ the usual noninformative distribution characterized by the improper density

$$p(\sigma) \propto \sigma^{-1} \tag{3.4}$$

on $\Re_+$, $P_\beta$ is any $\sigma$-finite measure on $\Re^k$, and $P_\nu$ and $P_\gamma$ are proper distributions. An important special case of (3.3) is where $P_\gamma$ is a point mass at 1, which characterizes symmetry of the error distribution. In the sequel of this Section, we shall examine the influence of allowing for skewness on posterior inference. To this end, we compare posterior results under a general $P_\gamma$ with those where $P_\gamma$ is a Dirac measure at 1. For notational simplicity, we shall denote the latter case by $\gamma = 1$.

First of all, since the prior distribution in (3.3)-(3.4) is improper, existence of the posterior distribution needs to be verified. In addition, our interest will be focused on the location and scale parameters $\beta$ and $\sigma$, since $\nu$ and $\gamma$ are merely auxiliary parameters to widen the class of sampling distributions. We shall therefore also address the issue of existence of posterior moments of $\beta$ and $\sigma$.
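To show how the sampling model and the prior combine, the following self-contained sketch evaluates an unnormalized log posterior for $(\beta, \sigma)$ conditional on fixed $\nu$ and $\gamma$. Several choices here are assumptions rather than statements from the text: the location is taken to be linear, $g_i(\beta) = x_i'\beta$; $P_\beta$ is taken to be Lebesgue measure (a flat prior, one admissible $\sigma$-finite measure); and $f_\nu$ is taken to be Student-t. The proper priors on $\nu$ and $\gamma$ are omitted because the sketch conditions on both. All function names are hypothetical. Since each $y_i$ has density $\sigma^{-1} p(\varepsilon_i \mid \nu, \gamma)$ and the prior contributes $\sigma^{-1}$, the kernel carries the term $-(n+1)\log\sigma$.

```python
import math
from typing import Sequence


def log_student_t(x: float, nu: float) -> float:
    """Log of the Student-t density with nu degrees of freedom (assumed f_nu)."""
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi)
            - (nu + 1.0) / 2.0 * math.log1p(x * x / nu))


def log_skewed(eps: float, nu: float, gamma: float) -> float:
    """Log of the two-piece density (3.2)."""
    z = eps / gamma if eps >= 0.0 else gamma * eps
    return math.log(2.0 / (gamma + 1.0 / gamma)) + log_student_t(z, nu)


def log_posterior_kernel(beta: Sequence[float], sigma: float,
                         y: Sequence[float], X: Sequence[Sequence[float]],
                         nu: float, gamma: float) -> float:
    """Unnormalized log posterior of (beta, sigma) given nu and gamma.

    Assumes g_i(beta) = x_i' beta, a flat prior on beta, and p(sigma)
    proportional to 1 / sigma as in (3.4).
    """
    if sigma <= 0.0 or gamma <= 0.0 or nu <= 0.0:
        return -math.inf
    # n factors of 1/sigma from the likelihood plus one from the prior.
    total = -(len(y) + 1) * math.log(sigma)
    for y_i, x_i in zip(y, X):
        fitted = sum(b * x for b, x in zip(beta, x_i))
        total += log_skewed((y_i - fitted) / sigma, nu, gamma)
    return total


if __name__ == "__main__":
    y = [1.0, 2.1, 2.9]
    X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
    print(log_posterior_kernel([0.0, 1.0], 0.5, y, X, nu=5.0, gamma=1.5))
```

A function of this kind only evaluates the kernel pointwise; it says nothing about whether the kernel integrates to a finite value, which is the existence question this Section raises.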
