Study and Application of Interval Censoring in Parametric Regression Models



A STUDY OF INTERVAL CENSORING 333

In a regression problem, the location parameter, µ, is usually allowed to vary with the conditions, $x_i$, under study. Here, I shall use the log link function,

$$\log(\mu_i) = g_\mu(x_i, \beta_\mu)$$

for the first four densities above, and the identity link function,

$$\mu_i = g_\mu(x_i, \beta_\mu)$$

for the five 'logged' densities, where $g_\mu(\cdot)$ is some general regression function that may be nonlinear in the parameters. Note that, in the first case, µ refers to y and, in the second case, to log(y). Other link functions are also possible.
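The two link functions can be sketched in a few lines of Python. This is only an illustration: the linear form of the regression function, the helper names, and the numerical values of the conditions and coefficients are all assumptions, not taken from the paper.

```python
import math

def linear_predictor(x, beta):
    """g(x, beta) in the simplest case: linear in the parameters (an assumption)."""
    return sum(b * v for b, v in zip(beta, x))

def location_log_link(x, beta):
    """Log link: log(mu) = g(x, beta), so mu = exp(g) lives on the scale of y."""
    return math.exp(linear_predictor(x, beta))

def location_identity_link(x, beta):
    """Identity link: mu = g(x, beta), with mu on the scale of log(y)."""
    return linear_predictor(x, beta)

x = (1.0, 2.0)       # hypothetical conditions: intercept and one covariate
beta = (0.5, 0.25)   # hypothetical coefficients
mu_log = location_log_link(x, beta)
mu_identity = location_identity_link(x, beta)
```

With the log link the fitted location is always positive, which suits densities defined directly on y; with the identity link the same predictor is read on the log(y) scale.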

I shall also consider two other, less common, regression models, each in addition to the regression equation for the location parameter, µ. These models, thus, contain two regression equations (Lindsey, 1974b). The first extension allows the dispersion parameter, φ, to vary with the conditions:

$$\log(\phi_i) = g_\phi(x_i, \beta_\phi)$$

The second extension involves a finite mixture whereby either the left or the right censored observations may come from a mixture of two populations (Boag, 1949; Berkson and Gage, 1952; Cutler and Axtell, 1963; Haybittle, 1965; Farewell, 1977, 1982, 1986; Schmidt and Witte, 1988; Kuk and Chen, 1992; Maller, 1993; Moulton and Halsey, 1995):

$$f(y; \mu, \phi) = (1 - \xi) z + \xi \left[ F(y_r; \mu, \phi) - F(y_l; \mu, \phi) \right]$$

where z is a binary indicator taking the value one if an observation is right censored and zero otherwise, so that 1 − ξ is the probability of belonging to the group never having the event. (Again, the integral can be approximated by a density.) Then, this probability can also be allowed to vary with the conditions:

$$\log\left(\frac{\xi_i}{1 - \xi_i}\right) = g_\xi(x_i, \beta_\xi)$$
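As a rough illustration of the mixture probability, the following Python sketch evaluates the contribution of one observation. The Weibull distribution, with its shape parameter `alpha` standing in for the dispersion parameter, is an assumption made only so that F has a closed form; the function names and parameter values are likewise hypothetical.

```python
import math

def weibull_cdf(y, mu, alpha):
    """Weibull CDF with scale mu and shape alpha; F(inf) = 1."""
    if math.isinf(y):
        return 1.0
    return -math.expm1(-((y / mu) ** alpha))

def mixture_contribution(lower, upper, xi, mu, alpha):
    """(1 - xi) * z + xi * [F(upper) - F(lower)], with z = 1 when right censored."""
    z = 1.0 if math.isinf(upper) else 0.0
    interval_prob = weibull_cdf(upper, mu, alpha) - weibull_cdf(lower, mu, alpha)
    return (1.0 - xi) * z + xi * interval_prob

# Hypothetical parameter values, chosen only to exercise the formula.
p_interval = mixture_contribution(24.0, 57.0, xi=0.6, mu=100.0, alpha=1.0)
p_censored = mixture_contribution(24.0, math.inf, xi=0.6, mu=100.0, alpha=1.0)
```

For a right-censored observation the two terms add: the subject either belongs to the group never having the event, with probability 1 − ξ, or belongs to the susceptible group and has the event after the last visit.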

I have written two general functions in the statistical language, R (Ihaka and Gentleman, 1996), that are available from me. They handle these two double regression models for the nine distributions described above, as well as a number of discrete distributions. For the continuous distributions, the user can choose either the usual density-based likelihood or that based on the difference of cumulative functions in Equation (2). If $g(\cdot)$ is a linear function of the parameters, the regression model may be specified using the Wilkinson and Rogers (1973) notation.
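The difference between the two likelihood choices can be seen numerically. The sketch below is not the R code referred to above; it is a Python illustration that assumes a Weibull distribution and arbitrary parameter values, and compares the difference of cumulative functions with the density at the interval midpoint multiplied by the interval width.

```python
import math

def weibull_cdf(y, mu, alpha):
    """Weibull cumulative distribution function with scale mu and shape alpha."""
    return -math.expm1(-((y / mu) ** alpha))

def weibull_pdf(y, mu, alpha):
    """Weibull density with scale mu and shape alpha."""
    return (alpha / mu) * (y / mu) ** (alpha - 1.0) * math.exp(-((y / mu) ** alpha))

def interval_probability(lower, upper, mu, alpha):
    """Likelihood contribution as a difference of cumulative functions."""
    return weibull_cdf(upper, mu, alpha) - weibull_cdf(lower, mu, alpha)

def density_approximation(lower, upper, mu, alpha):
    """Density at the interval midpoint multiplied by the interval width."""
    midpoint = 0.5 * (lower + upper)
    return weibull_pdf(midpoint, mu, alpha) * (upper - lower)

def relative_error(lower, upper, mu, alpha):
    """Relative error of the density approximation against the exact probability."""
    exact = interval_probability(lower, upper, mu, alpha)
    return abs(density_approximation(lower, upper, mu, alpha) - exact) / exact

# Hypothetical intervals: one narrow (measurement precision), one wide (visit gap).
narrow_error = relative_error(9.95, 10.05, mu=20.0, alpha=1.5)
wide_error = relative_error(24.0, 113.0, mu=100.0, alpha=1.5)
```

For the narrow interval the two agree closely, whereas for an interval as wide as the gaps between visits the density approximation is visibly off, which is the situation where the difference of cumulative functions matters.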

In the examples to follow, the inference criterion for comparing the models under consideration, whether differing in functional form or in the number of parameters, will be their ability to predict the observed data, that is, how probable they make the data. In other words, they will be compared directly through the minimized −log likelihood. When the numbers of parameters in models differ, this may be penalized by adding the number of estimated parameters, a form of the Akaike information criterion (AIC, see Akaike, 1973).
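A minimal sketch of this comparison rule, in Python, is given here. The model labels and the −log likelihood values are invented purely for illustration and are not results from the paper.

```python
def penalized_neg_log_lik(neg_log_lik, n_params):
    """Minimized -log likelihood plus the number of estimated parameters."""
    return neg_log_lik + n_params

# Hypothetical fitted values, invented for illustration only.
candidates = {
    "model A (2 parameters)": (310.4, 2),
    "model B (3 parameters)": (309.9, 3),
}
scores = {
    name: penalized_neg_log_lik(nll, n_params)
    for name, (nll, n_params) in candidates.items()
}
preferred = min(scores, key=scores.get)  # the smaller score is preferred
```

This criterion is half the conventional AIC, −2 log L + 2p, so it ranks models identically. In the hypothetical case above, the extra parameter of model B lowers the −log likelihood by only 0.5, less than the penalty of one, so model A is retained.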

334 LINDSEY

Table 1. Intervals, in months, between visits within which subjects changed from HIV-negative to positive and the corresponding frequencies, $n_i$, with time measured starting in December, 1979, from Carstensen (1996).

Left  Right    n     Left  Right    n
   0     24   24       28      ∞    8
   0     39    2       39     57    3
  24     28    4       39    113    2
  24     39    1       39      ∞   15
  24     57   10       57     88    5
  24     88    3       57    113    1
  24    113    4       57      ∞   22
  24      ∞   61       88    113    1
  28     39    4       88      ∞   34
  28     88    1      113      ∞   92
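For readers who wish to work with these data, Table 1 can be transcribed as (left, right, n) triples. The Python sketch below does so, with the variable names being my own choice, and checks the transcription against the total number of subjects.

```python
import math

INF = math.inf

# (left, right, n) triples from Table 1; months since December 1979.
# An infinite right endpoint marks a right-censored interval.
TABLE_1 = [
    (0, 24, 24), (0, 39, 2), (24, 28, 4), (24, 39, 1), (24, 57, 10),
    (24, 88, 3), (24, 113, 4), (24, INF, 61), (28, 39, 4), (28, 88, 1),
    (28, INF, 8), (39, 57, 3), (39, 113, 2), (39, INF, 15), (57, 88, 5),
    (57, 113, 1), (57, INF, 22), (88, 113, 1), (88, INF, 34), (113, INF, 92),
]

total_subjects = sum(n for _, _, n in TABLE_1)
right_censored = sum(n for _, right, n in TABLE_1 if math.isinf(right))
interval_censored = total_subjects - right_censored
```

The frequencies sum to 297 subjects, of whom 232 are right censored, so only 65 conversions are bracketed by two visits.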

Smaller values indicate relatively more preferable models. Intervals of precision for the parameters of interest will be constructed using normed profile likelihoods (that is, the likelihood is normed by dividing by its maximum value and the profile obtained by varying the parameter of interest over the range of values under study while maximizing over all other parameters).
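The construction of a normed profile likelihood can be sketched as follows, using the Table 1 data. Several things here are assumptions made for illustration: a Weibull model without the mixture extension, the shape `alpha` as the parameter of interest with the scale `mu` as the nuisance parameter, and a coarse grid search in place of a proper numerical optimizer.

```python
import math

INF = math.inf
# (left, right, n) triples from Table 1, in months since December 1979.
DATA = [
    (0, 24, 24), (0, 39, 2), (24, 28, 4), (24, 39, 1), (24, 57, 10),
    (24, 88, 3), (24, 113, 4), (24, INF, 61), (28, 39, 4), (28, 88, 1),
    (28, INF, 8), (39, 57, 3), (39, 113, 2), (39, INF, 15), (57, 88, 5),
    (57, 113, 1), (57, INF, 22), (88, 113, 1), (88, INF, 34), (113, INF, 92),
]

def weibull_cdf(y, mu, alpha):
    """Weibull CDF with scale mu and shape alpha; F(inf) = 1."""
    if math.isinf(y):
        return 1.0
    return -math.expm1(-((y / mu) ** alpha))

def log_lik(mu, alpha):
    """Interval-censored log likelihood built from differences of the CDF."""
    total = 0.0
    for left, right, n in DATA:
        prob = weibull_cdf(right, mu, alpha) - weibull_cdf(left, mu, alpha)
        total += n * math.log(max(prob, 1e-300))  # guard against log(0)
    return total

# Coarse grids (an assumption, chosen only to keep the sketch dependency-free).
alpha_grid = [0.5 + 0.1 * k for k in range(26)]                  # 0.5 to 3.0
mu_grid = [50.0 * (100.0 ** (k / 199.0)) for k in range(200)]    # 50 to 5000

# Profile: for each alpha, maximize over the nuisance parameter mu.
profile = {a: max(log_lik(m, a) for m in mu_grid) for a in alpha_grid}
best = max(profile.values())

# Norm by the maximum so that the curve peaks at one.
normed_profile = {a: math.exp(value - best) for a, value in profile.items()}
```

An interval of precision is then read off as the set of `alpha` values whose normed profile likelihood exceeds a chosen level.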

In interval censored data, the likelihoods of parametric and nonparametric models are not generally comparable because the latter do not give the probability of right-censored observations. These are only used conditionally, in the risk set. However, I shall provide some nonparametric results for visual comparison in graphs. Because this is not the centre of interest, I shall use midpoints for calculating Kaplan-Meier estimates, rather than the more sophisticated procedures of Turnbull (1974, 1976). The examples in Lindsey and Ryan (1998) confirm that this is justified.
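The midpoint approach can be sketched in Python with the Table 1 data. The code is an illustration under stated assumptions rather than the procedure actually used for the graphs: each finite interval is replaced by an event at its midpoint, each interval with an infinite right endpoint is treated as censored at its left endpoint, and subjects censored at an event time are counted as still at risk.

```python
import math

INF = math.inf
# (left, right, n) triples from Table 1, in months since December 1979.
DATA = [
    (0, 24, 24), (0, 39, 2), (24, 28, 4), (24, 39, 1), (24, 57, 10),
    (24, 88, 3), (24, 113, 4), (24, INF, 61), (28, 39, 4), (28, 88, 1),
    (28, INF, 8), (39, 57, 3), (39, 113, 2), (39, INF, 15), (57, 88, 5),
    (57, 113, 1), (57, INF, 22), (88, 113, 1), (88, INF, 34), (113, INF, 92),
]

def impute_midpoints(data):
    """Return (time, is_event, count): midpoint events and right-censored times."""
    records = []
    for left, right, n in data:
        if math.isinf(right):
            records.append((float(left), False, n))        # censored at last visit
        else:
            records.append((0.5 * (left + right), True, n))  # event at midpoint
    return records

def kaplan_meier(records):
    """Product-limit estimate; returns a list of (event time, survival)."""
    event_times = sorted({t for t, is_event, _ in records if is_event})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(c for u, _, c in records if u >= t)
        events = sum(c for u, is_event, c in records if is_event and u == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

km_curve = kaplan_meier(impute_midpoints(DATA))
```

The first step of the resulting curve is at 12 months, the midpoint of the interval from 0 to 24, where 24 of the 297 subjects at risk are counted as events.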

3. HIV Infection

I shall first consider an example of highly censored observations with no explanatory variables. Carstensen (1996) gives data on diagnosis of 297 Danish homosexuals for HIV antibody positivity at six widely spaced time points between December, 1981, and May, 1989. Many people were not present for all visits. An additional complicating problem with these data is that the time origin, when all individuals were uninfected, is unknown; the data are doubly censored. Following Carstensen, I assume that the time origin is the same for all individuals, provisionally taking this to be December 1979, and present the data in this form in Table 1.

Thus, one question concerns the point at which individuals were not yet infected. A second question relates to estimation of the proportion of the group who were HIV-positive by 1990. This is intimately linked to a third question: is there a subgroup that will never
