hidden layer is fully connected to the next hidden layer, and the last hidden layer is
fully connected to the output. The Neural Network node supports many variations of
this general form.
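The fully connected layering described above can be sketched as a plain forward pass. This is a minimal Python illustration and not the Neural Network node's implementation: the tanh hidden activation, the identity output, and all weights are assumptions chosen only to show how each layer feeds every unit of the next.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every input feeds every unit."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def forward(inputs, hidden_layers, output_layer):
    """Pass values through each hidden layer in turn, then the output layer."""
    values = inputs
    for weights, biases in hidden_layers:
        values = dense(values, weights, biases, math.tanh)  # assumed activation
    out_weights, out_biases = output_layer
    return dense(values, out_weights, out_biases, lambda z: z)  # identity output

# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 hidden unit -> 1 output.
hidden_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]
output_layer = ([[2.0]], [1.0])

prediction = forward([0.0, 0.0], hidden_layers, output_layer)
```

With all-zero inputs every hidden unit outputs tanh(0) = 0, so the prediction reduces to the output bias.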
The Princomp/Dmneural node enables you to fit an additive nonlinear model that uses
the bucketed principal components as inputs to predict a binary or an interval target
variable. The node also performs a principal components analysis and passes the
principal components to the successor nodes.
The User Defined Model node enables you to generate assessment statistics by using
predicted values from a model that you built with the SAS Code node (for example, a
logistic model that uses the SAS/STAT LOGISTIC procedure) or the Variable Selection
node. You can also generate assessment statistics for models that are built by a
third-party software product when you create a SAS data set that contains the
predicted values from the model. The predicted values can also be saved to a SAS data
set and then imported into the process flow with the Input Data Source node.
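As a rough illustration of computing assessment statistics from predicted values alone, the Python sketch below computes two common measures for a binary target. The function name, the 0.5 cutoff, and the choice of statistics are assumptions made for the example. The statistics that the User Defined Model node actually reports are not listed here.

```python
def assess_binary(actual, predicted_prob, cutoff=0.5):
    """Compute simple assessment statistics from predicted event probabilities.

    actual         -- list of 0/1 target values
    predicted_prob -- list of predicted probabilities of the event (1)
    cutoff         -- assumed decision threshold for classifying a case as 1
    """
    n = len(actual)
    if n == 0 or n != len(predicted_prob):
        raise ValueError("actual and predicted_prob must be non-empty and equal length")
    misclassified = sum(
        1 for y, p in zip(actual, predicted_prob) if (p >= cutoff) != (y == 1)
    )
    squared_error = sum((y - p) ** 2 for y, p in zip(actual, predicted_prob))
    return {
        "misclassification_rate": misclassified / n,
        "average_squared_error": squared_error / n,
    }

# Hypothetical predicted values, for example from a model built outside the flow.
actual = [1, 0, 1, 0]
predicted = [0.75, 0.25, 0.25, 0.5]
stats = assess_binary(actual, predicted)
```

The only inputs are the target values and the predictions, which is why the same calculation applies whether the model came from a SAS procedure or from third-party software.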
The Ensemble node creates a new model by averaging the posterior probabilities (for
class targets) or the predicted values (for interval targets) from multiple models. The
new model is then used to score new data. One common approach is to resample the
training data and fit a separate model for each sample. The Ensemble node then
integrates the component models to form a potentially stronger solution. Another
common approach is to use multiple modeling methods, such as a neural network and a
decision tree, to obtain separate models from the same training data set.
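The averaging step can be sketched in a few lines of Python. This is an illustration of the idea only, not the Ensemble node's code. The function name and the sample predictions are hypothetical. The same arithmetic applies to posterior probabilities for a class target and to predicted values for an interval target.

```python
def ensemble_average(predictions_by_model):
    """Average the predictions of several models, case by case.

    predictions_by_model -- one list of predictions per component model,
                            all covering the same cases in the same order
    """
    if not predictions_by_model:
        raise ValueError("at least one component model is required")
    n_models = len(predictions_by_model)
    n_cases = len(predictions_by_model[0])
    if any(len(preds) != n_cases for preds in predictions_by_model):
        raise ValueError("every model must score the same number of cases")
    return [
        sum(preds[i] for preds in predictions_by_model) / n_models
        for i in range(n_cases)
    ]

# Hypothetical posterior probabilities of the event from two modeling methods.
neural_network = [0.75, 0.25, 0.5]
decision_tree = [0.25, 0.25, 1.0]
combined = ensemble_average([neural_network, decision_tree])
```

Where the two component models agree (the second case), the combined value is unchanged. Where they disagree, the combined value falls between them.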
The Ensemble node integrates the component models from the complementary
modeling methods to form the final model solution. The Ensemble node can also be
used to combine the scoring code from stratified models. The modeling nodes generate
different scoring formulas when they operate on a stratification variable (for example, a
group variable such as Gender) that you define in a Group Processing node. The
Ensemble node combines the scoring code into a single DATA step by logically dividing
the data into blocks by using IF-THEN DO/END statements.
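The block structure of the combined scoring code can be pictured with the Python sketch below. It mirrors the logic only. The node itself generates SAS DATA step code, which is not reproduced here. The field names `gender` and `visits` and both scoring formulas are hypothetical.

```python
def score(row):
    """Score one record with the formula that belongs to its stratum.

    Each branch plays the role of one IF-THEN DO/END block: the
    stratification variable selects which model's formula is applied.
    """
    if row["gender"] == "F":
        # Formula from the model fit on the first stratum (hypothetical)
        return 0.5 + 0.125 * row["visits"]
    elif row["gender"] == "M":
        # Formula from the model fit on the second stratum (hypothetical)
        return 0.25 + 0.25 * row["visits"]
    # No model was fit for any other value of the stratification variable
    return None

records = [
    {"gender": "F", "visits": 2},
    {"gender": "M", "visits": 1},
    {"gender": "U", "visits": 3},
]
scores = [score(r) for r in records]
```

Because every stratum's formula sits in one function, new data can be scored in a single pass without first splitting it by group.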
Note that the ensemble model that is created from either approach can be more accurate
than the individual models only if the individual models differ from one another.
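A toy calculation makes this point concrete. In the Python sketch below, all numbers are constructed for illustration. Averaging two identical models reproduces the same error, whereas averaging two models that err on different cases lowers the average squared error. Differences among the models are necessary for an improvement but do not guarantee one.

```python
def average_squared_error(actual, predicted):
    """Mean of squared differences between target and prediction."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def average_two(first, second):
    """Case-by-case average of two models' predictions."""
    return [(a + b) / 2 for a, b in zip(first, second)]

actual = [1.0, 0.0, 1.0, 0.0]

model_a = [0.5, 0.5, 1.0, 0.0]   # weak on the first two cases
model_b = list(model_a)          # identical copy of model_a
model_c = [1.0, 0.0, 0.5, 0.5]   # weak on the last two cases instead

error_a = average_squared_error(actual, model_a)
error_c = average_squared_error(actual, model_c)
error_identical = average_squared_error(actual, average_two(model_a, model_b))
error_diverse = average_squared_error(actual, average_two(model_a, model_c))
```

Here `error_identical` equals `error_a`, while `error_diverse` is smaller than both `error_a` and `error_c`.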