SAS_STAT_9.2_User's_Guide


SAS/STAT 9.2 User's Guide

Introduction to Statistical Modeling with SAS/STAT Software

(Book Excerpt)

SAS Documentation

Chapter 3

Introduction to Statistical Modeling with SAS/STAT Software

Contents

Overview: Statistical Modeling
    Statistical Models
    Classes of Statistical Models
        Linear and Nonlinear Models
        Regression Models and Models with Classification Effects
        Univariate and Multivariate Models
        Fixed, Random, and Mixed Models
        Generalized Linear Models
        Latent Variable Models
        Bayesian Models
    Classical Estimation Principles
        Least Squares
        Likelihood
        Inference Principles for Survey Data
Statistical Background
    Hypothesis Testing and Power
    Important Linear Algebra Concepts
    Expectations of Random Variables and Vectors
    Mean Squared Error
    Linear Model Theory
        Finding the Least Squares Estimators
        Analysis of Variance
        Estimating the Error Variance
        Maximum Likelihood Estimation
        Estimable Functions
        Test of Hypotheses
        Residual Analysis
        Sweep Operator
References


Overview: Statistical Modeling

There are more than 70 procedures in SAS/STAT software, and the majority of them are dedicated to solving problems in statistical modeling. The goal of this chapter is to provide a roadmap to statistical models and to modeling tasks, enabling you to make informed choices about the appropriate modeling context and tool. This chapter also introduces important terminology, notation, and concepts used throughout this documentation. Subsequent introductory chapters discuss model families and related procedures.

It is difficult to capture the complexity of statistical models in a simple scheme, so the classification used here is necessarily incomplete. It is most practical to classify models in terms of simple criteria, such as the presence of random effects, the presence of nonlinearity, characteristics of the data, and so on. That is the approach used here. After a brief introduction to statistical modeling in general terms, the chapter describes a number of model classifications and relates them to modeling tools in SAS/STAT software.

Statistical Models

Deterministic and Stochastic Models

Purely mathematical models, in which the relationships between inputs and outputs are captured entirely in deterministic fashion, can be important theoretical tools but are impractical for describing observational, experimental, or survey data. For such phenomena, researchers usually allow the model to draw on stochastic as well as deterministic elements. When the uncertainty of realizations leads to the inclusion of random components, the resulting models are called stochastic models. A statistical model, finally, is a stochastic model that contains parameters, which are unknown constants that need to be estimated based on assumptions about the model and the observed data.

There are many reasons why statistical models are preferred over deterministic models. For example:

• Randomness is often introduced into a system in order to achieve a certain balance or representativeness. For example, random assignment of treatments to experimental units allows unbiased inferences about treatment effects. As another example, selecting individuals for a survey sample by random mechanisms ensures a representative sample.

• Even if a deterministic model can be formulated for the phenomenon under study, a stochastic model can provide a more parsimonious and more easily comprehended description. For example, it is possible in principle to capture the result of a coin toss with a deterministic model, taking into account the properties of the coin, the method of tossing, conditions of the medium through which the coin travels and of the surface on which it lands, and so on. A very complex model is required to describe the simple outcome: heads or tails. Alternatively, you can describe the outcome quite simply as the result of a stochastic process, a Bernoulli variable that results in heads with a certain probability.

• It is often sufficient to describe the average behavior of a process, rather than each particular realization. For example, a regression model might be developed to relate plant growth to nutrient availability. The explicit aim of the model might be to describe how the average growth changes with nutrient availability, not to predict the growth of an individual plant. The support for the notion of averaging in a model lies in the nature of expected values, describing typical behavior in the presence of randomness. This, in turn, requires that the model contain stochastic components.
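The coin-toss example can be made concrete with a short simulation. The following Python sketch is an illustration only and is not part of SAS/STAT software; the function name `toss_coin`, the probability 0.5, the seed, and the number of tosses are all assumptions chosen for the example. It treats each toss as a Bernoulli variable and summarizes the long-run share of heads instead of modeling the physics of any single toss.

```python
import random

def toss_coin(p_heads, rng):
    """Return 'H' with probability p_heads, else 'T' (one Bernoulli trial)."""
    return "H" if rng.random() < p_heads else "T"

rng = random.Random(2024)   # fixed seed so the run is reproducible
p_heads = 0.5               # assumed probability of heads (illustrative)
n_tosses = 10_000           # assumed number of tosses

# Each toss is described only by its probability, not by coin physics.
tosses = [toss_coin(p_heads, rng) for _ in range(n_tosses)]

# The average behavior is what the stochastic model describes.
share_heads = tosses.count("H") / n_tosses
print(f"share of heads in {n_tosses} tosses: {share_heads:.3f}")
```

A single parameter, the probability of heads, is enough to describe the average behavior of the process, which is the sense in which the stochastic description is more parsimonious.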

The defining characteristic of statistical models is their dependence on parameters and the incorporation of stochastic terms. The properties of the model and the properties of quantities derived from it must be studied in a long-run, average sense through expectations, variances, and covariances. The fact that the parameters of the model must be estimated from the data introduces a stochastic element in applying a statistical model: because the model is not deterministic but includes randomness, parameter estimates and related quantities derived from the model are likewise random. The properties of parameter estimators can often be described only in an asymptotic sense, imagining that some aspect of the data increases without bound (for example, the number of observations or the number of groups).
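To see why estimates are themselves random, and what studying them "in a long-run, average sense" looks like, consider the following Python sketch. It is a simulation under assumed values (a Bernoulli probability of 0.3, 2,000 hypothetical replicate data sets, and three sample sizes), not a description of any SAS/STAT procedure; the helper `estimate_p` is a name introduced here for illustration.

```python
import random
import statistics

def estimate_p(n, p_true, rng):
    """Estimate a Bernoulli probability by the sample proportion of n draws."""
    return sum(rng.random() < p_true for _ in range(n)) / n

rng = random.Random(7)   # fixed seed for reproducibility
p_true = 0.3             # assumed "unknown" parameter, fixed for the simulation
n_replicates = 2000      # number of hypothetical repeated data sets

spread = {}
for n in (10, 100, 1000):
    # Each replicate data set yields a different estimate of the same parameter.
    estimates = [estimate_p(n, p_true, rng) for _ in range(n_replicates)]
    spread[n] = statistics.pstdev(estimates)
    print(f"n={n:5d}  mean of estimates={statistics.fmean(estimates):.4f}  "
          f"std of estimates={spread[n]:.4f}")
```

The parameter stays fixed throughout, while the estimates vary from one replicate to the next; their spread shrinks as the number of observations grows, which is the kind of behavior that asymptotic statements describe.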

The process of estimating the parameters in a statistical model based on your data is called fitting the model. For many classes of statistical models there are a number of procedures in SAS/STAT software that can perform the fitting. In many cases, different procedures solve identical estimation problems; that is, their parameter estimates are identical. In some cases, the same model parameters are estimated by different statistical principles, such as least squares versus maximum likelihood estimation. Parameter estimates obtained by different methods typically have different statistical properties (distribution, variance, bias, and so on). The choice between competing estimation principles is often made on the basis of properties of the estimators. Distinguishing properties might include (but are not necessarily limited to) computational ease, interpretive ease, bias, variance, mean squared error, and consistency.
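One familiar case in which two estimation principles give estimators with different properties is the variance of a normal sample. The Python sketch below is a hedged illustration under assumed values (true mean 5, true variance 4, samples of size 5, and 20,000 replicates); it is not output from, or a reimplementation of, any SAS/STAT procedure. The maximum likelihood estimator under a normal model divides the residual sum of squares by n, whereas the residual mean square commonly used alongside least squares divides by n - 1 in this intercept-only setting.

```python
import random
import statistics

rng = random.Random(11)   # fixed seed for reproducibility
mu, sigma2 = 5.0, 4.0     # assumed true mean and variance
n = 5                     # small sample so the bias is visible
n_replicates = 20000      # number of hypothetical repeated data sets

ml_estimates, unbiased_estimates = [], []
for _ in range(n_replicates):
    y = [rng.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)       # residual sum of squares
    ml_estimates.append(ss / n)                # maximum likelihood (normal model)
    unbiased_estimates.append(ss / (n - 1))    # residual mean square

avg_ml = statistics.fmean(ml_estimates)
avg_unbiased = statistics.fmean(unbiased_estimates)
print(f"average ML estimate:            {avg_ml:.3f}")
print(f"average residual mean square:   {avg_unbiased:.3f}")
print(f"true variance:                  {sigma2:.3f}")
```

Both estimators use the same data and the same sum of squares. In theory the divide-by-n version has expectation (n - 1)/n times the true variance, so it is biased downward, but it also has the smaller variance; this is the kind of trade-off that mean squared error summarizes.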

Model-Based and Design-Based Randomness

A statistical model is a description of the data-generating mechanism, not a description of the specific data to which it is applied. The aim of a model is to capture those aspects of a phenomenon that are relevant to inquiry and to explain how the data could have come about as a realization of a random experiment. These relevant aspects might include the genesis of the randomness and the stochastic effects in the phenomenon under study. Different schools of thought can lead to different model formulations, different analytic strategies, and different results. Coarsely, you can distinguish between a viewpoint of innate randomness and one of induced randomness. This distinction leads to model-based and design-based inference approaches.

In a design-based inference framework, the random variation in the observed data is induced by random selection or random assignment. Consider the case of a survey sample from a finite population of size $N$; suppose that $\mathcal{F}_N = \{y_i : i \in U_N\}$ denotes the finite set of possible values and $U_N$ is the index set $U_N = \{1, 2, \ldots, N\}$. Then a sample $S$, a subset of $U_N$, is selected by probability rules. The realization of the random experiment is the selection of a particular set $S$; the associated values selected from $\mathcal{F}_N$ are considered fixed. If properties of a design-based sampling estimator
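As a rough illustration of design-based randomness, the following Python sketch enumerates every possible sample from a tiny finite population. The six population values are made up, and simple random sampling without replacement is an assumption chosen for the example; the text above only says that the sample is selected by probability rules. The values attached to the units never change; the only random element is which subset of the index set is selected.

```python
from fractions import Fraction
from itertools import combinations

# Fixed, non-random values y_i attached to units i = 1..N (made-up numbers).
population = {1: 12, 2: 7, 3: 9, 4: 15, 5: 7, 6: 10}
N = len(population)
n = 3   # assumed sample size

population_mean = Fraction(sum(population.values()), N)

# Assumed design: simple random sampling without replacement, so every
# subset S of size n drawn from the index set is equally likely.
samples = list(combinations(population, n))
sample_means = [Fraction(sum(population[i] for i in S), n) for S in samples]

# Expectation over the design: average the estimator over all possible samples.
design_expectation = sum(sample_means) / len(samples)

print(f"number of possible samples: {len(samples)}")
print(f"population mean:            {population_mean}")
print(f"design expectation:         {design_expectation}")
print(f"range of sample means:      {min(sample_means)} to {max(sample_means)}")
```

Individual sample means differ from sample to sample, yet their average over all possible samples equals the population mean under this design. No distributional assumption about the values themselves is needed for that statement.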
