SAS Enterprise Miner Hands-On Case Study Tutorial (Second Edition)


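"SAS® Enterprise Miner Applied Case Study Tutorial (Second Edition)" is a practical guide to data mining with SAS Enterprise Miner. It is written for readers who want to learn and practice data analysis through case studies, and it was published by SAS Institute Inc. in 2003. The book covers SAS Enterprise Miner, a business intelligence tool that lets users find hidden patterns and trends in large volumes of data in order to support decision making. Its main features are:

1. **Theory combined with practice**: through a series of concrete cases, the book shows how to apply SAS Enterprise Miner to real problems, so that readers can understand and apply the basic concepts, techniques, and methods of data mining.
2. **Second edition updates**: as a second edition, it may include coverage of new software features and guidance tuned to the latest release available at the time.
3. **Copyright and licensing**: all rights are reserved by SAS Institute Inc., and use of the book is subject to the vendor's terms. For U.S. government users, use is also subject to the restricted rights for commercial computer software set out in FAR 52.227-19.
4. **SAS product support**: SAS Publishing offers a range of books and electronic products intended to help customers make full use of SAS software, so the tutorial is also one part of a broader set of SAS learning materials.
5. **Publication information**: the first printing is dated April 2003. Some details and examples may therefore be out of date, and present-day readers should keep that in mind.

Readers can expect to cover the following topics:

- How SAS Enterprise Miner works and how it is structured
- Data preprocessing and cleaning techniques
- Data visualization and exploratory data analysis
- Common data mining algorithms (such as classification, regression, clustering, and association rules)
- Model assessment and selection
- Deploying predictive models and interpreting the results
- Applying data mining to practical problems in business settings

The book is suitable for both professionals and beginners who want to build their data science skills, and it can serve either as an introductory text or as a reference manual.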

Overview of the Nodes

The Interactive Grouping node enables you to interactively group variable values into classes. Statistical and plotted information can be interactively rearranged as you explore various variable groupings. The Interactive Grouping node requires a binary target variable.
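As a rough illustration of why grouping needs a binary target, the sketch below maps raw variable values to classes and reports the share of target events in each class. It is a plain Python analogy, not the node's own computation; the function name `event_rate_by_group`, the `grouping` mapping, and the sample data are all hypothetical.

```python
from collections import defaultdict

def event_rate_by_group(values, targets, grouping):
    """Return the fraction of target events (target == 1) in each class.

    `grouping` maps each raw value to a class label (hypothetical input).
    """
    counts = defaultdict(lambda: [0, 0])  # class -> [events, total cases]
    for value, target in zip(values, targets):
        group = grouping[value]
        counts[group][0] += target
        counts[group][1] += 1
    return {group: events / total for group, (events, total) in counts.items()}

# Hypothetical data: regions A and B are merged into one class.
values = ["A", "B", "C", "A"]
targets = [1, 0, 1, 0]
grouping = {"A": "low", "B": "low", "C": "high"}
rates = event_rate_by_group(values, targets, grouping)
```

Changing the `grouping` mapping and recomputing the rates mimics, in a very small way, rearranging groups and watching the statistics change.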

Model Nodes

The Regression node enables you to fit both linear and logistic regression models to your data. You can use both continuous and discrete variables as inputs. The node supports the stepwise, forward, and backward-selection methods. A point-and-click interaction builder enables you to create higher-order modeling terms.
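To make "higher-order modeling terms" concrete, here is a minimal Python sketch that expands a set of numeric inputs with their squares and pairwise products. It only illustrates the idea of interaction terms and does not reproduce the interaction builder; the function name and the `a*b` naming convention are assumptions made for this example.

```python
from itertools import combinations_with_replacement

def higher_order_terms(row, degree=2):
    """Return the original inputs plus all products of up to `degree` inputs.

    `row` maps input names to numeric values. Term names such as
    "age*income" are a naming convention invented for this sketch.
    """
    names = sorted(row)
    terms = dict(row)
    for d in range(2, degree + 1):
        for combo in combinations_with_replacement(names, d):
            value = 1.0
            for name in combo:
                value *= row[name]
            terms["*".join(combo)] = value
    return terms

# Hypothetical inputs.
example = higher_order_terms({"age": 3.0, "income": 2.0})
```

The expanded dictionary holds the two original inputs, both squared terms, and the cross product, which could then be offered to a regression as candidate effects.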

The Tree node enables you to perform multiway splitting of your database, based on nominal, ordinal, and continuous variables. This is the SAS implementation of decision trees, which represents a hybrid of the best of CHAID, CART, and C4.5 algorithms. The node supports both automatic and interactive training. When you run the Tree node in automatic mode, it automatically ranks the input variables by the strength of their contribution to the tree. This ranking can be used to select variables for use in subsequent modeling. In addition, dummy variables can be generated for use in subsequent modeling. Using interactive training, you can override any automatic step by defining a splitting rule or by pruning a node or subtree.
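The text does not say how the Tree node measures an input's contribution. As a hedged illustration of the general idea of ranking inputs, the sketch below scores each numeric input by the best Gini impurity reduction it can achieve with a single binary split on a 0/1 target. The choice of Gini, the single-split simplification, and all names are assumptions of this example, not the node's algorithm.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def split_gain(values, labels, threshold):
    """Impurity reduction from splitting the cases at `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted

def rank_inputs(columns, labels):
    """Rank inputs by the best single-split gain each one can achieve."""
    scores = {}
    for name, values in columns.items():
        # Splitting at the largest value would leave the right side empty.
        candidates = sorted(set(values))[:-1]
        scores[name] = max(
            (split_gain(values, labels, t) for t in candidates), default=0.0
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical data: "age" separates the target perfectly, "noise" does not.
columns = {"age": [20, 25, 40, 50], "noise": [1, 2, 1, 2]}
labels = [0, 0, 1, 1]
ranking = rank_inputs(columns, labels)
```

A ranking like this could be used to keep only the top-scoring inputs for a later model, which mirrors the variable selection use described above.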

The Neural Network node enables you to construct, train, and validate multilayer feed-forward neural networks. By default, the Neural Network node automatically constructs a multilayer feed-forward network that has one hidden layer consisting of three neurons. In general, each input is fully connected to the first hidden layer, each hidden layer is fully connected to the next hidden layer, and the last hidden layer is fully connected to the output. The Neural Network node supports many variations of this general form.
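The default architecture described above (inputs fully connected to one hidden layer of three neurons, which is fully connected to the output) can be sketched as a forward pass in a few lines of Python. The tanh hidden activation, the logistic output, and the weight values are assumptions chosen for illustration; the text does not specify them.

```python
import math

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass through one fully connected hidden layer.

    hidden_weights holds one weight list per hidden neuron, each with one
    weight per input. Activations are assumptions of this sketch: tanh for
    the hidden layer and a logistic function for the single output.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-z))

# Two inputs, three hidden neurons, one output; weights are made up.
hidden_weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.0, -1.0, 0.5]
probability = forward([1.0, 2.0], hidden_weights, hidden_biases, output_weights, 0.2)
```

Training would adjust the weights and biases; this sketch only shows how a fixed network turns inputs into a predicted probability.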

The Princomp/Dmneural node enables you to fit an additive nonlinear model that uses the bucketed principal components as inputs to predict a binary or an interval target variable. The node also performs a principal components analysis and passes the principal components to the successor nodes.
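For readers unfamiliar with principal components analysis, the following minimal sketch finds the first principal component of a small data set by power iteration on the sample covariance matrix and returns each case's score on that component. It covers only the generic PCA step, not the bucketing or the additive nonlinear model, and power iteration is a technique chosen for this example rather than anything the text attributes to the node.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Uses power iteration on the sample covariance matrix. The start vector
    is arbitrary; the method fails only if it happens to be orthogonal to
    the leading eigenvector.
    """
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]
    vec = [1.0 / (j + 1) for j in range(dims)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # no variance in the data
        vec = [v / norm for v in nxt]
    scores = [sum(c[j] * vec[j] for j in range(dims)) for c in centered]
    return vec, scores

# Hypothetical data lying exactly on the line y = x.
direction, scores = first_principal_component([(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

The `scores` list is the kind of derived input that a successor model could use in place of the original variables.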

The User Defined Model node enables you to generate assessment statistics by using predicted values from a model that you built with the SAS Code node (for example, a logistic model that uses the SAS/STAT LOGISTIC procedure) or the Variable Selection node. You can also generate assessment statistics for models that are built by a third-party software product when you create a SAS data set that contains the predicted values from the model. The predicted values can also be saved to a SAS data set and then imported into the process flow with the Input Data Source node.
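The point of this node is that assessment needs only the actual target values and the predicted values, wherever the model was built. The sketch below computes two common statistics from such a pair of columns. Which statistics the node actually reports is not stated here, so the choice of misclassification rate and average squared error, along with the 0.5 cutoff, is an assumption.

```python
def assess(actual, predicted_prob, cutoff=0.5):
    """Compute simple assessment statistics from externally produced predictions.

    `actual` holds 0/1 target values and `predicted_prob` holds predicted
    event probabilities from any model, including a third-party one.
    """
    decisions = [1 if p >= cutoff else 0 for p in predicted_prob]
    n = len(actual)
    errors = sum(1 for y, d in zip(actual, decisions) if y != d)
    squared_error = sum((y - p) ** 2 for y, p in zip(actual, predicted_prob))
    return {
        "misclassification_rate": errors / n,
        "average_squared_error": squared_error / n,
    }

# Hypothetical predictions imported from another tool.
stats = assess([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```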

The Ensemble node creates a new model by averaging the posterior probabilities (for class targets) or the predicted values (for interval targets) from multiple models. The new model is then used to score new data. One common approach is to resample the training data and fit a separate model for each sample. The Ensemble node then integrates the component models to form a potentially stronger solution. Another common approach is to use multiple modeling methods, such as a neural network and a decision tree, to obtain separate models from the same training data set.
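The averaging step described above is simple enough to sketch directly. The example assumes an unweighted mean across models and uses made-up posterior probabilities for a tree and a neural network; the function name is hypothetical.

```python
def average_posteriors(model_outputs):
    """Average the predictions of several models, case by case.

    `model_outputs` is a list with one list of predictions per model;
    every model must score the same cases in the same order.
    """
    n_models = len(model_outputs)
    n_cases = len(model_outputs[0])
    return [
        sum(outputs[i] for outputs in model_outputs) / n_models
        for i in range(n_cases)
    ]

# Hypothetical posterior probabilities for two cases from two models.
tree_probs = [0.2, 0.8]
network_probs = [0.4, 0.6]
ensemble_probs = average_posteriors([tree_probs, network_probs])
```

The same function works for interval targets, where the inputs are predicted values rather than probabilities.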

The Ensemble node integrates the component models from the complementary modeling methods to form the final model solution. The Ensemble node can also be used to combine the scoring code from stratified models. The modeling nodes generate different scoring formulas when they operate on a stratification variable (for example, a group variable such as Gender) that you define in a Group Processing node. The Ensemble node combines the scoring code into a single DATA step by logically dividing the data into blocks by using IF-THEN DO/END statements.
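The combined scoring logic can be pictured as one routine that branches on the stratification variable and applies that stratum's formula. The sketch below is a Python analogy to the IF-THEN DO/END blocks, not generated SAS scoring code; the two formulas, their coefficients, and the field names are invented for illustration.

```python
def score_female(row):
    """Hypothetical scoring formula fitted on the Gender = "F" stratum."""
    return 0.10 + 0.02 * row["age"]

def score_male(row):
    """Hypothetical scoring formula fitted on the Gender = "M" stratum."""
    return 0.30 + 0.01 * row["age"]

def combined_score(row):
    """Apply the formula that matches the row's stratum, one block per group."""
    if row["gender"] == "F":
        return score_female(row)
    elif row["gender"] == "M":
        return score_male(row)
    return None  # no stratified model covers this group

# Hypothetical cases to score.
rows = [{"gender": "F", "age": 10}, {"gender": "M", "age": 10}, {"gender": "U", "age": 10}]
scored = [combined_score(row) for row in rows]
```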

It is important to note that the ensemble model that is created from either approach can be more accurate than the individual models only if the individual models differ.

