where
$$\ell_i(\beta; b) \propto \int_{y_i}^{\mu_i} \frac{y_i - u}{a_i\,\nu(u)}\, du \qquad (4)$$
defines the conditional log quasi-likelihood of the fixed effects given the random effects $b$ (Breslow and Clayton, 1993). The likelihood (3) is then maximized with respect to $\beta$ and $\theta$ in order to obtain the MLEs $\hat{\beta}$ and $\hat{\theta}$.
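As a concrete illustration (a standard worked case, not taken from the paper), suppose $a_i = 1$ and the variance function is $\nu(u) = u$, as for Poisson responses. The integral in (4) then has a closed form:
$$\int_{y_i}^{\mu_i} \frac{y_i - u}{u}\, du = y_i \log \mu_i - \mu_i - (y_i \log y_i - y_i),$$
so, up to terms free of $\mu_i$, $\ell_i(\beta; b)$ reduces to the familiar Poisson log-likelihood contribution $y_i \log \mu_i - \mu_i$.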
However, the likelihood $L(\beta, \theta)$ in (3) in general involves analytically intractable integrals. When the dimension $q$ of the random effects $b$ is low (e.g., $q < 5$), the GHQ approach is useful for approximating the likelihood (Pan and Thompson, 2003). As mentioned in Section 1, this technique is, however, no longer helpful for high-dimensional random effects. The Laplace approximation (Breslow and Clayton, 1993) and the EM algorithm (Booth and Hobert, 1999) were proposed to calculate the MLEs of the parameters $\beta$ and $\theta$ in the GLMM. MCMC was also used (Karim and Zeger, 1992).
3. Quasi-Monte Carlo integration
We propose to use the QMC approach to approximate the integrated quasi-likelihood $L(\beta, \theta)$ in (3). To gain insight into the QMC approach, we first briefly review the traditional Monte Carlo (MC) approximation. Suppose $f(\cdot)$ is an integrable function on the $q$-dimensional unit cube $C^q = [0, 1)^q$. Consider the integral
$$I(f) = \int_{C^q} f(x)\, dx. \qquad (5)$$
MC integration draws a random sample $P_K = \{x_k : 1 \le k \le K\}$ from the uniform distribution on $C^q$ and then approximates the integral in (5) through
$$\hat{I}_K(f, P_K) = \frac{1}{K} \sum_{k=1}^{K} f(x_k). \qquad (6)$$
By the strong law of large numbers, the estimate $\hat{I}_K(f, P_K)$ converges to $I(f)$ with probability one as $K \to \infty$. Moreover, the central limit theorem guarantees that $\hat{I}_K(f, P_K)$ is asymptotically normally distributed when the sample size $K$ is large enough. The convergence rate of MC integration is of order $O(K^{-1/2})$, regardless of the dimensionality $q$ of the integral. On the other hand, this convergence statement is a probabilistic one: the MC approximation may behave well in general, but for a particular random sample it may yield a very poor approximation. Hence, it is necessary to make multiple draws of random samples and take the average of all the approximations, which may be computationally expensive.
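The following sketch (ours, in Python with NumPy; the integrand and all names are illustrative choices, not from the paper) implements the plain MC estimator (6) together with the repeated-draws averaging just described.

```python
import numpy as np

def mc_integrate(f, q, K, rng):
    """Plain MC estimate of I(f) over C^q = [0,1)^q, as in Eq. (6)."""
    x = rng.random((K, q))        # K uniform points on the unit cube
    return f(x).mean()

# Test integrand with known answer: f(x) = prod_j 2*x_j, so I(f) = 1.
f = lambda x: np.prod(2.0 * x, axis=1)

rng = np.random.default_rng(0)
q, K = 5, 10_000
# Average several independent draws, as the text recommends; the spread
# across draws reflects the O(K^{-1/2}) Monte Carlo error.
estimates = [mc_integrate(f, q, K, rng) for _ in range(20)]
print(np.mean(estimates), np.std(estimates))
```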
The QMC approach aims to improve on the MC approximation through a faster convergence rate and a lighter computational load. The key idea is to replace the MC random samples with integration nodes that are scattered uniformly over $C^q$. The reason lies in the famous Koksma–Hlawka inequality:
$$|I(f) - \hat{I}_K(f, P_K)| \le V(f)\, D(P_K), \qquad (7)$$
where $V(f)$ is the total variation of $f(\cdot)$ over $C^q$ in the sense of Hardy and Krause, assumed bounded (Fang and Wang, 1994, p. 64). The quantity $D(P_K)$ measures the evenness of the spread of the point set $P_K$ and is defined by
$$D(P_K) = \sup_{x \in C^q} |U_K(x) - U(x)|, \qquad (8)$$
where $U(x)$ is the uniform distribution function on $C^q$ and $U_K(x)$ is the empirical distribution function of $P_K$. The quantity $D(P_K)$ in (8) is in fact the Kolmogorov–Smirnov statistic, but it is known as the discrepancy of $P_K$ in analytic number theory.
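To make (8) concrete, here is a rough grid-search approximation of the discrepancy (our illustration, not the paper's; exact computation is expensive, and a finite grid only lower-bounds the supremum):

```python
import numpy as np
from itertools import product
from scipy.stats import qmc

def discrepancy_grid(points, m=50):
    """Rough approximation of D(P_K) in Eq. (8): compare the empirical
    distribution U_K(x) of the points with U(x) = prod_j x_j on a grid."""
    K, q = points.shape
    grid_1d = np.linspace(0.0, 1.0, m + 1)[1:]       # corners x of boxes [0, x)
    worst = 0.0
    for corner in product(grid_1d, repeat=q):
        x = np.array(corner)
        u_emp = np.mean(np.all(points < x, axis=1))  # U_K(x)
        u_unif = np.prod(x)                          # U(x), volume of [0, x)
        worst = max(worst, abs(u_emp - u_unif))
    return worst

# q = 2 keeps the grid small; random points vs. a Halton low-discrepancy set.
rng = np.random.default_rng(0)
print(discrepancy_grid(rng.random((256, 2))))
print(discrepancy_grid(qmc.Halton(d=2, scramble=False).random(256)))  # smaller
```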
Inequality (7) implies that the absolute error of the integration approximation is bounded and dominated by $D(P_K)$, since $V(f)$ is a constant once $f(\cdot)$ is given. Therefore, the point set $P_K$ with the smallest discrepancy provides the best integration nodes in this sense. It can be shown that the smallest achievable discrepancy is of order $O((\log K)^{q-1}/K)$ (Fang and Wang, 1994). Accordingly, QMC integration has a faster convergence rate than the MC approximation. Unlike the MC approximation, the QMC integration nodes are deterministic, so multiple draws are not necessary.
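As a hedged illustration of the deterministic-nodes idea (ours; the paper's own node construction is not shown here, and this sketch substitutes a Sobol' sequence from SciPy's scipy.stats.qmc module):

```python
import numpy as np
from scipy.stats import qmc

q = 5
K = 2**13                                   # Sobol' sets balance at powers of 2
f = lambda x: np.prod(2.0 * x, axis=1)      # same test integrand; I(f) = 1

# Deterministic low-discrepancy nodes: no repeated draws are needed.
nodes = qmc.Sobol(d=q, scramble=False).random(K)
qmc_est = f(nodes).mean()

# Plain MC with the same budget K, for comparison.
mc_est = f(np.random.default_rng(0).random((K, q))).mean()

print(abs(qmc_est - 1.0), abs(mc_est - 1.0))  # QMC error is typically smaller
```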