
will be used in later chapters of the book. Chapter 3 covers some of the basic
ideas of statistics and sampling distributions. Since many of the methods in
computational statistics are concerned with estimating distributions via
simulation, this chapter is fundamental to the rest of the book. For the same
reason, we present some techniques for generating random variables in
Chapter 4.
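To give a taste of what Chapter 4 is about, the following is a minimal sketch
of one standard technique, the inverse transform method, written in base
MATLAB; the exponential distribution, the rate and the sample size are
illustrative choices for this sketch, not values taken from the text.

   % Inverse transform method: exponential random variables from uniforms.
   % The rate lambda and sample size n are assumed for this example only.
   lambda = 2;               % illustrative rate parameter
   n = 1000;                 % number of variates to generate
   u = rand(n,1);            % uniform(0,1) random numbers
   x = -log(1-u)/lambda;     % inverse exponential CDF applied to u
   mean(x)                   % should be close to 1/lambda = 0.5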
Some of the methods in computational statistics enable the researcher to
explore the data before other analyses are performed. These techniques are
especially important with high-dimensional data sets or when the questions
to be answered using the data are not well focused. In Chapter 5, we present
some graphical exploratory data analysis techniques that could fall into the
category of traditional statistics (e.g., box plots, scatterplots). We include
them in this text so statisticians can see how to implement them in MATLAB
and to educate scientists and engineers as to their usage in exploratory data
analysis. Other graphical methods in this chapter do fall into the category of
computational statistics. Among these are isosurfaces, parallel coordinates,
the grand tour and projection pursuit.
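To give a flavor of these methods, here is a bare-bones parallel coordinates
plot written in base MATLAB; the simulated data and the scaling of each
variable to [0,1] are assumptions made for this sketch, and the book's own
implementations may differ.

   % Parallel coordinates: each observation (row of X) becomes a polyline
   % across the p coordinate axes. Data here are simulated for illustration.
   n = 50; p = 4;
   X = randn(n,p);                                   % illustrative data
   span = max(X) - min(X);                           % columnwise ranges
   Xs = (X - repmat(min(X),n,1))./repmat(span,n,1);  % scale columns to [0,1]
   plot(1:p, Xs', '-')                               % one line per observation
   xlabel('Coordinate'), ylabel('Scaled value')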
In Chapters 6 and 7, we present methods that come under the general
heading of resampling. We first cover some of the general concepts in
hypothesis testing and confidence intervals to help the reader better
understand what follows. We then provide procedures for hypothesis testing
using simulation, including a discussion on evaluating the performance of
hypothesis tests. This is followed by the bootstrap method, where the data
set is used as an estimate of the population and subsequent sampling is done
from the sample. We show how to get bootstrap estimates of standard error,
bias and confidence intervals. Chapter 7 continues with two closely related
methods called jackknife and cross-validation.
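As a preview, the following is a minimal bootstrap sketch in base MATLAB
that estimates the standard error of the sample median; the simulated data,
the choice of statistic and the number of replicates are illustrative, not
the book's own examples.

   % Bootstrap standard error: resample the data with replacement B times,
   % recompute the statistic each time, and take the standard deviation.
   x = randn(30,1);                 % original sample (simulated here)
   n = length(x);
   B = 1000;                        % number of bootstrap replicates
   thetab = zeros(B,1);
   for b = 1:B
       ind = randi(n, n, 1);        % indices drawn with replacement
       thetab(b) = median(x(ind));  % statistic for this bootstrap sample
   end
   sehat = std(thetab)              % bootstrap estimate of the standard error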
One of the important applications of computational statistics is the
estimation of probability density functions. Chapter 8 covers this topic,
with an emphasis on the nonparametric approach. We show how to obtain
estimates using probability density histograms, frequency polygons, averaged
shifted histograms, kernel density estimates, finite mixtures and adaptive
mixtures.
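The following sketch shows the basic form of a kernel density estimate with
a Gaussian kernel in base MATLAB; the normal reference rule used for the
bandwidth is an assumption for this example, not necessarily the choice
recommended in the text.

   % Kernel density estimate: place a scaled Gaussian bump at each data
   % point and average. The bandwidth h uses the normal reference rule.
   x = randn(100,1);                       % observed sample (simulated)
   n = length(x);
   h = 1.06*std(x)*n^(-1/5);               % normal reference bandwidth
   xgrid = linspace(min(x)-3*h, max(x)+3*h, 200);
   fhat = zeros(size(xgrid));
   for i = 1:n
       z = (xgrid - x(i))/h;               % scaled distances to x(i)
       fhat = fhat + exp(-0.5*z.^2)/sqrt(2*pi);
   end
   fhat = fhat/(n*h);                      % normalize to integrate to one
   plot(xgrid, fhat)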
Chapter 9 uses some of the concepts from probability density estimation
and cross-validation. In this chapter, we present some techniques for
statistical pattern recognition. As before, we start with an introduction to
the classical methods and then illustrate some of the techniques that can be
considered part of computational statistics, such as classification trees
and clustering.
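As one small illustration of the clustering side of this chapter, here is a
bare-bones k-means loop in base MATLAB; the simulated data, the choice k = 2
and the fixed number of iterations are assumptions for this sketch, and
empty clusters are not handled.

   % k-means clustering: alternate between assigning each point to its
   % nearest center and recomputing the centers as cluster means.
   X = [randn(30,2); randn(30,2)+3];      % two loose groups in the plane
   n = size(X,1);  k = 2;
   idx = randperm(n);
   C = X(idx(1:k),:);                     % initial centers: k random points
   for iter = 1:20
       D = zeros(n,k);
       for j = 1:k
           D(:,j) = sum((X - repmat(C(j,:),n,1)).^2, 2);  % squared distances
       end
       [dmin, lab] = min(D, [], 2);       % label of the nearest center
       for j = 1:k
           C(j,:) = mean(X(lab==j,:), 1); % update center as the cluster mean
       end
   end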
In Chapter 10 we describe some of the algorithms for nonparametric
regression and smoothing. One nonparametric technique is a tree-based
method called regression trees. Another uses the kernel densities of
Chapter 8. Finally, we discuss smoothing using loess and its variants.
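To make the kernel idea concrete, here is a minimal Nadaraya-Watson kernel
smoother in base MATLAB, one simple instance of kernel-based regression
rather than the specific estimators developed in the chapter; the simulated
data and the bandwidth are illustrative choices.

   % Kernel smoother: the fit at each point is a weighted average of the
   % responses, with Gaussian weights that decay with distance in x.
   x = sort(rand(100,1));                % predictor values (simulated)
   y = sin(2*pi*x) + 0.2*randn(100,1);   % noisy response
   h = 0.05;                             % smoothing bandwidth (assumed)
   yhat = zeros(size(x));
   for i = 1:length(x)
       w = exp(-0.5*((x - x(i))/h).^2);  % Gaussian kernel weights
       yhat(i) = sum(w.*y)/sum(w);       % weighted average of the responses
   end
   plot(x, y, '.', x, yhat, '-')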
An approach for simulating a distribution that has become widely used
over the last several years is called Markov chain Monte Carlo. Chapter 11
covers this important topic and shows how it can be used to simulate a
posterior distribution. Once we have the posterior distribution, we can use
it to estimate statistics of interest (means, variances, etc.).
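The flavor of these methods can be seen in the following minimal random-walk
Metropolis sampler written in base MATLAB; the standard normal target, the
proposal scale and the burn-in length are assumptions made for this sketch,
not the book's examples.

   % Random-walk Metropolis: propose a move from the current value and
   % accept it with a probability based on the ratio of target densities.
   target = @(t) exp(-0.5*t.^2);         % unnormalized target ("posterior")
   nsim = 5000;
   theta = zeros(nsim,1);
   for i = 2:nsim
       cand = theta(i-1) + 0.5*randn;    % candidate from a random walk
       if rand < min(1, target(cand)/target(theta(i-1)))
           theta(i) = cand;              % accept the candidate
       else
           theta(i) = theta(i-1);        % keep the current value
       end
   end
   mean(theta(1001:end))                 % estimated mean after burn-in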