PREFACE
• R is used as a tool not only for calculation and data analysis but also to illustrate
the concepts of probability, to simulate distributions, and to explore different
decision-making scenarios by experimentation. The R books currently
available skim over the concepts of probability and concentrate on using R for
statistical inference and modeling.
• Recognizing that the student better understands definitions, generalizations, and
abstractions after seeing the applications, almost all new ideas are introduced
and illustrated by real examples, covering a wide range of computer science
applications.
Although we have addressed in the first instance computer scientists, we believe that
this book should also be suitable for students of engineering and of mathematics.
The book has five parts in all, starting with Part I, an introduction to
R. This part presents the procedures of R needed to summarize and provide graphical
displays of statistical data. An introduction to programming in R is also included.
Not meant to be a manual, this part is intended only to get the student started. As we
progress, more procedures of R are introduced as the need arises.
Part II sets the foundations of probability and introduces the functions available
in R for examining them. R is used not only for calculating probabilities involving
unwieldy computations but also for obtaining probabilities through simulation. Probability
events and sample spaces are illustrated with the usual gambling experiments,
as well as inspection of integrated circuit chips, and observation of randomness in
computer programming. A discussion of the "Intel Chip Fiasco" leads to the "balls and
bins" problem, which in turn is applied to assigning jobs to processors. It is shown
how Bayes' theorem has important applications in modern-day computer science
such as machine learning and translation. Methods to assess reliability of a computer
containing many systems, which in turn contain many components, are considered.
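
As a flavor of what obtaining a probability through simulation can look like, the
following is a minimal sketch in R of a "balls and bins" experiment. It is not taken
from the chapters themselves: the numbers of jobs and processors, the variable names,
and the helper function has_collision are assumptions chosen purely for illustration.
Jobs are assigned to processors uniformly at random, and we estimate the probability
that at least one processor receives more than one job.

```r
# Illustrative sketch only: all names and values below are hypothetical.
set.seed(1)            # make the simulation reproducible
n_processors <- 10     # the "bins"
n_jobs       <- 5      # the "balls"
n_trials     <- 10000  # number of simulated assignments

# One trial: assign each job to a processor chosen uniformly at random and
# report whether any processor received more than one job.
has_collision <- function(jobs, processors) {
  assigned <- sample(processors, size = jobs, replace = TRUE)
  anyDuplicated(assigned) > 0
}

# The relative frequency of collisions estimates the probability.
estimate <- mean(replicate(n_trials, has_collision(n_jobs, n_processors)))

# Exact value for comparison: 1 - P(all jobs land on distinct processors).
exact <- 1 - prod((n_processors - 0:(n_jobs - 1)) / n_processors)

print(c(simulated = estimate, exact = exact))
```

With these settings the exact probability is 0.6976, and the simulated relative
frequency should lie close to it; increasing n_trials tightens the agreement.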
Part III deals with discrete random variables and expectation. Nearly every
chapter opens with a sequence of examples, designed to motivate the detail that
follows. Techniques are developed for examining discrete variables by simulation
in R. The objective is to enable students to approximate parameters even when they
lack the mathematical knowledge to derive them exactly. The Bernoulli, geometric,
binomial, hypergeometric, and Poisson distributions are each dealt with in a similar
fashion, beginning with a set of examples with different parameters and using
the graphical facilities in R to examine their distributions. Limiting distributions are
exhibited through simulation, and the students use R to obtain rules of thumb to establish
when these approximations are valid. R is also used to design single- and
double-sampling inspection schemes.
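
To give an idea of how a limiting distribution might be examined in R, here is a
short sketch comparing binomial probabilities with their Poisson approximation and
approximating a mean by simulation. The parameter values (n = 100, p = 0.02) and the
variable names are assumptions made for illustration; they are not the examples used
in Part III.

```r
# Illustrative sketch only: parameter values and names are hypothetical.
set.seed(1)
n <- 100    # number of Bernoulli trials
p <- 0.02   # probability of success on each trial
k <- 0:10   # values at which to compare the two distributions

binom_probs <- dbinom(k, size = n, prob = p)  # exact binomial probabilities
pois_probs  <- dpois(k, lambda = n * p)       # Poisson approximation, mean n * p

# Largest pointwise discrepancy between the two sets of probabilities.
max_abs_diff <- max(abs(binom_probs - pois_probs))

# Approximate the mean by simulation instead of deriving it.
sim_mean <- mean(rbinom(10000, size = n, prob = p))

print(round(cbind(k, binomial = binom_probs, poisson = pois_probs), 4))
print(c(max_abs_diff = max_abs_diff, sim_mean = sim_mean, exact_mean = n * p))
```

Repeating the comparison for other values of n and p is one way to arrive at a rule
of thumb for when the approximation is adequate.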
Part IV deals with continuous random variables. The exponential distribution is
introduced as the waiting time between Poisson occurrences, and the graphical facilities
of R illustrate the models. The Markov memoryless property is simulated using
R. Some applications of the exponential distribution are investigated, notably in the
areas of reliability and queues. R is used to model response times with varying traffic
intensities. We have examined models for server queue lengths without using any of