SAS/STAT® 9.2 User's Guide
Introduction to Statistical Modeling with SAS/STAT Software
(Book Excerpt)
SAS® Documentation
This document is an individual chapter from SAS/STAT® 9.2 User's Guide.
The correct bibliographic citation for the complete manual is as follows: SAS Institute Inc. 2008. SAS/STAT® 9.2
User's Guide. Cary, NC: SAS Institute Inc.
Copyright © 2008, SAS Institute Inc., Cary, NC, USA
All rights reserved. Produced in the United States of America.
For a Web download or e-book: Your use of this publication shall be governed by the terms established by the vendor
at the time you acquire this publication.
U.S. Government Restricted Rights Notice: Use, duplication, or disclosure of this software and related documentation
by the U.S. government is subject to the Agreement with SAS Institute and the restrictions set forth in FAR 52.227-19,
Commercial Computer Software-Restricted Rights (June 1987).
SAS Institute Inc., SAS Campus Drive, Cary, North Carolina 27513.
1st electronic book, March 2008
2nd electronic book, February 2009
SAS® Publishing provides a complete selection of books and electronic products to help customers use SAS software to
its fullest potential. For more information about our e-books, e-learning products, CDs, and hard-copy books, visit the
SAS Publishing Web site at support.sas.com/publishing or call 1-800-727-3228.
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute
Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Chapter 3
Introduction to Statistical Modeling with
SAS/STAT Software
Contents
Overview: Statistical Modeling
  Statistical Models
  Classes of Statistical Models
    Linear and Nonlinear Models
    Regression Models and Models with Classification Effects
    Univariate and Multivariate Models
    Fixed, Random, and Mixed Models
    Generalized Linear Models
    Latent Variable Models
    Bayesian Models
  Classical Estimation Principles
    Least Squares
    Likelihood
    Inference Principles for Survey Data
Statistical Background
  Hypothesis Testing and Power
  Important Linear Algebra Concepts
  Expectations of Random Variables and Vectors
  Mean Squared Error
  Linear Model Theory
    Finding the Least Squares Estimators
    Analysis of Variance
    Estimating the Error Variance
    Maximum Likelihood Estimation
    Estimable Functions
    Test of Hypotheses
    Residual Analysis
    Sweep Operator
References
Overview: Statistical Modeling
There are more than 70 procedures in SAS/STAT software, and the majority of them are dedicated
to solving problems in statistical modeling. The goal of this chapter is to provide a roadmap to
statistical models and to modeling tasks, enabling you to make informed choices about the appropriate
modeling context and tool. This chapter also introduces important terminology, notation,
and concepts used throughout this documentation. Subsequent introductory chapters discuss model
families and related procedures.
It is difficult to capture the complexity of statistical models in a simple scheme, so the classification
used here is necessarily incomplete. It is most practical to classify models in terms of simple criteria,
such as the presence of random effects, the presence of nonlinearity, characteristics of the data, and
so on. That is the approach used here. After a brief introduction to statistical modeling in general
terms, the chapter describes a number of model classifications and relates them to modeling tools
in SAS/STAT software.
Statistical Models
Deterministic and Stochastic Models
Purely mathematical models, in which the relationships between inputs and outputs are captured
entirely in deterministic fashion, can be important theoretical tools but are impractical for describing
observational, experimental, or survey data. For such phenomena, researchers usually allow the
model to draw on stochastic as well as deterministic elements. When the uncertainty of realizations
leads to the inclusion of random components, the resulting models are called stochastic models.
A statistical model, finally, is a stochastic model that contains parameters, which are unknown
constants that need to be estimated based on assumptions about the model and the observed data.
There are many reasons why statistical models are preferred over deterministic models. For example:
- Randomness is often introduced into a system in order to achieve a certain balance or
  representativeness. For example, random assignment of treatments to experimental units allows
  unbiased inferences about treatment effects. As another example, selecting individuals for a
  survey sample by random mechanisms ensures a representative sample.
- Even if a deterministic model can be formulated for the phenomenon under study, a stochastic
  model can provide a more parsimonious and more easily comprehended description. For
  example, it is possible in principle to capture the result of a coin toss with a deterministic
  model, taking into account the properties of the coin, the method of tossing, conditions of the
  medium through which the coin travels and of the surface on which it lands, and so on. A
  very complex model is required to describe the simple outcome: heads or tails. Alternatively,
  you can describe the outcome quite simply as the result of a stochastic process, a Bernoulli
  variable that results in heads with a certain probability.
- It is often sufficient to describe the average behavior of a process, rather than each particular
  realization. For example, a regression model might be developed to relate plant growth to
  nutrient availability. The explicit aim of the model might be to describe how the average
  growth changes with nutrient availability, not to predict the growth of an individual plant.
  The support for the notion of averaging in a model lies in the nature of expected values,
  describing typical behavior in the presence of randomness. This, in turn, requires that the
  model contain stochastic components.
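The coin-toss example above can be illustrated with a minimal Python sketch. It is not part of the SAS documentation: the function names, the seed, and the probability of 0.5 are assumptions chosen for illustration. Each toss is a single Bernoulli draw, and the fraction of heads over many tosses shows the kind of average behavior that a stochastic model is meant to describe.

```python
import random


def toss_coin(p_heads, rng):
    """One Bernoulli draw: 'H' with probability p_heads, otherwise 'T'."""
    return "H" if rng.random() < p_heads else "T"


def heads_fraction(n_tosses, p_heads, seed):
    """Fraction of heads in n_tosses independent tosses (seeded for repeatability)."""
    rng = random.Random(seed)
    heads = sum(toss_coin(p_heads, rng) == "H" for _ in range(n_tosses))
    return heads / n_tosses


if __name__ == "__main__":
    # Hypothetical settings: a fair coin and an arbitrary seed.
    for n in (10, 1000, 100000):
        print(n, heads_fraction(n, p_heads=0.5, seed=2008))
```

No individual toss is predicted here; only the long-run fraction of heads is described, and that fraction settles near the assumed probability as the number of tosses grows.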
The defining characteristic of statistical models is their dependence on parameters and the
incorporation of stochastic terms. The properties of the model and the properties of quantities derived from
it must be studied in a long-run, average sense through expectations, variances, and covariances.
The fact that the parameters of the model must be estimated from the data introduces a stochastic
element in applying a statistical model: because the model is not deterministic but includes
randomness, parameters and related quantities derived from the model are likewise random. The properties
of parameter estimators can often be described only in an asymptotic sense, imagining that some
aspect of the data increases without bound (for example, the number of observations or the number
of groups).
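To make the randomness of estimates concrete, the following Python sketch (an illustration, not taken from the SAS documentation) repeatedly generates data sets from a normal model and estimates the mean in each. The function names, the parameter values, and the number of replications are assumptions. The spread of the estimates across data sets shrinks roughly like sigma squared divided by n as the number of observations n grows, which is the kind of long-run property the paragraph above refers to.

```python
import random
import statistics


def simulate_mean_estimates(n_obs, n_datasets, mu, sigma, seed):
    """Estimate mu by the sample mean in each of n_datasets simulated data sets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_datasets):
        data = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        estimates.append(statistics.fmean(data))
    return estimates


def estimator_variance(n_obs, n_datasets=2000, mu=10.0, sigma=2.0, seed=2008):
    """Empirical variance of the sample-mean estimator across simulated data sets."""
    return statistics.variance(
        simulate_mean_estimates(n_obs, n_datasets, mu, sigma, seed)
    )


if __name__ == "__main__":
    # Hypothetical settings: mu = 10, sigma = 2, so theory gives variance 4 / n.
    for n in (10, 100):
        print(n, estimator_variance(n), 4.0 / n)
```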
The process of estimating the parameters in a statistical model based on your data is called fitting
the model. For many classes of statistical models there are a number of procedures in SAS/STAT
software that can perform the fitting. In many cases, different procedures solve identical
estimation problems; that is, their parameter estimates are identical. In some cases, the same model
parameters are estimated by different statistical principles, such as least squares versus maximum
likelihood estimation. Parameter estimates obtained by different methods typically have different
statistical properties (distribution, variance, bias, and so on). The choice between competing
estimation principles is often made on the basis of properties of the estimators. Distinguishing
properties might include (but are not necessarily limited to) computational ease, interpretive ease,
bias, variance, mean squared error, and consistency.
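As one hedged illustration of estimators with different properties, consider a normal model with a single mean parameter. The maximum likelihood estimator of the error variance divides the residual sum of squares by n, whereas the estimator customarily paired with least squares divides by n - 1. The Python sketch below compares their bias and mean squared error by simulation. It is not from the SAS documentation, and the function names, sample size, and seed are assumptions.

```python
import random


def variance_estimates(sample):
    """Return (ML estimate, unbiased estimate) of the variance for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((y - mean) ** 2 for y in sample)  # residual sum of squares
    return ss / n, ss / (n - 1)


def compare_estimators(n_obs, n_datasets, sigma, seed):
    """Monte Carlo (bias, MSE) of both variance estimators under a normal model."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    ml, unbiased = [], []
    for _ in range(n_datasets):
        sample = [rng.gauss(0.0, sigma) for _ in range(n_obs)]
        est_ml, est_unbiased = variance_estimates(sample)
        ml.append(est_ml)
        unbiased.append(est_unbiased)

    def summarize(estimates):
        bias = sum(estimates) / len(estimates) - true_var
        mse = sum((e - true_var) ** 2 for e in estimates) / len(estimates)
        return bias, mse

    return {"ml": summarize(ml), "unbiased": summarize(unbiased)}


if __name__ == "__main__":
    # Hypothetical settings: 5 observations per data set, sigma = 1.
    for name, (bias, mse) in compare_estimators(5, 20000, 1.0, 2008).items():
        print(f"{name}: bias={bias:.3f} mse={mse:.3f}")
```

In this setting the maximum likelihood estimator is biased downward but has the smaller mean squared error, so the preferred estimator depends on which property matters for the analysis.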
Model-Based and Design-Based Randomness
A statistical model is a description of the data-generating mechanism, not a description of the
specific data to which it is applied. The aim of a model is to capture those aspects of a phenomenon
that are relevant to inquiry and to explain how the data could have come about as a realization of
a random experiment. These relevant aspects might include the genesis of the randomness and the
stochastic effects in the phenomenon under study. Different schools of thought can lead to different
model formulations, different analytic strategies, and different results. Coarsely, you can distinguish
between a viewpoint of innate randomness and one of induced randomness. This distinction leads
to model-based and design-based inference approaches.
In a design-based inference framework, the random variation in the observed data is induced by
random selection or random assignment. Consider the case of a survey sample from a finite population
of size $N$; suppose that $F_N = \{y_i : i \in U_N\}$ denotes the finite set of possible values and
$U_N$ is the index set $U_N = \{1, 2, \ldots, N\}$. Then a sample $S$, a subset of $U_N$, is selected by
probability rules. The realization of the random experiment is the selection of a particular set $S$; the
associated values selected from $F_N$ are considered fixed. If properties of a design-based sampling estimator
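The design-based setup described above can be sketched in Python for a tiny finite population. This is an illustration rather than material from the SAS documentation. The population values are hypothetical, and simple random sampling without replacement is assumed as the probability rule. The values y_i stay fixed; the only randomness is which index set S is drawn, so the expectation of the sample mean can be computed exactly by enumerating every possible S.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical fixed values y_i for the index set U_N = {1, ..., 5}.
POPULATION = {1: 12, 2: 7, 3: 9, 4: 15, 5: 7}


def design_expectation_of_sample_mean(values_by_index, sample_size):
    """Expected sample mean over all equally likely samples S of the given size."""
    index_set = sorted(values_by_index)
    samples = list(combinations(index_set, sample_size))  # every possible S
    prob = Fraction(1, len(samples))  # equal selection probability (assumed design)
    return sum(
        prob * Fraction(sum(values_by_index[i] for i in s), sample_size)
        for s in samples
    )


def population_mean(values_by_index):
    """Mean of the fixed finite-population values."""
    return Fraction(sum(values_by_index.values()), len(values_by_index))


if __name__ == "__main__":
    print(design_expectation_of_sample_mean(POPULATION, 2))
    print(population_mean(POPULATION))
```

Under this assumed design the two printed quantities agree, which shows the sample mean to be unbiased with respect to the randomization induced by the sampling design alone.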