a steep learning curve, we try to break this down into small, task-oriented
steps.
In this edition we place a greater emphasis on more idiomatic R. For a
small example, despite the greater familiarity of using = for the assignment
operator, we now use the <- operator. Another example comes in Chapter 4,
where we resist the temptation to illustrate some data manipulations with
the widely used plyr package and instead utilize similar functions from base
R. For our limited demands, the corner cases that led to the desire for a
plyr-type approach are not present, and we believe that it is good to start
with a grounding in the functionality provided by base R.
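As a small illustration of both points, the following sketch uses the arrow for assignment and computes a grouped summary with base R alone. The choice of tapply() and aggregate(), and of the built-in mtcars data set, is ours for illustration; it is an assumption, not a statement of which functions Chapter 4 uses.

```r
# Assignment with the arrow operator rather than =
x <- c(2, 4, 6, 8)
mean_x <- mean(x)

# A grouped summary using only base R:
# mean miles per gallon for each cylinder count in the built-in mtcars data
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)

# The same summary via aggregate(), which returns a data frame
mpg_by_cyl_df <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
```

Here tapply() returns a named array with one entry per group, while aggregate() returns a data frame with one row per group, which is often handier for further work.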
We also try to avoid as many of the pitfalls as possible for new R users by
encouraging the use of RStudio, a feature-rich, cross-platform development
environment for interacting with R. RStudio has very good integration with
R’s help system and its administrative tools; it has an integrated debugger, a
powerful editor, and much more. Though relatively new to the R community,
the company has already made an enormous contribution.
This book was written using the excellent knitr package for R. This pack-
age allows one to embed R code into a document with ease. The formatting
of code blocks follows a convention championed by the knitr author. We
think it makes the code much easier to read and, hence, to reason about. It also
encourages thinking of interacting with R using a script, rather than the com-
mand line directly. This style of usage is facilitated by RStudio.
In addition to changes with R, the teaching of introductory statistics (by
which we mean a non-calculus approach to inferential statistics) has changed
in the last decade or so. For example, primarily due to the widespread
availability of computational resources but also for pedagogical reasons,
there have been pushes to include resampling approaches, permutation
methods, and Bayesian analysis in the first-year course. The topics of this
text hew closely to the traditional ones, but we have added a bit on these
computer-intensive approaches, in particular to motivate the more traditional approach.
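To give a flavor of what a computer-intensive approach can look like, here is a minimal sketch of a permutation test for a difference in means. The data set (the built-in mtcars), the helper diff_means(), the seed, and the number of shuffles are all our own choices for illustration, not taken from the text.

```r
set.seed(1)  # arbitrary seed so the shuffles are reproducible

# Observed difference in mean mpg between manual (am == 1)
# and automatic (am == 0) cars in the built-in mtcars data
mpg <- mtcars$mpg
am <- mtcars$am
diff_means <- function(y, g) mean(y[g == 1]) - mean(y[g == 0])  # hypothetical helper
observed <- diff_means(mpg, am)

# Shuffle the group labels many times to build a reference distribution
n_perm <- 2000
perm_diffs <- replicate(n_perm, diff_means(mpg, sample(am)))

# Approximate two-sided p-value: the share of shuffles whose difference
# is at least as extreme as the observed one
p_value <- mean(abs(perm_diffs) >= abs(observed))
```

If the group labels were unrelated to mpg, the observed difference would look like a typical shuffled one; a small p_value suggests otherwise. The traditional two-sample t-test addresses the same question with a formula in place of the shuffling.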
We continue with an emphasis on realistic data and examples (which re-
quired updating some now not-so-topical examples) and we rely on visual-
ization techniques to gather insight. Fortunately, the R language makes such
inclusion quite easy.
Organization. The text has three main parts. The first five chapters
introduce the basics of exploratory data analysis and data manipulation in R. The
approach is a little slower than it need be. We postpone until Chapter 4 the
details of using R’s data frames. These are the primary means to store mul-
tivariate data in R, and in Chapters 4 and 5 we demonstrate many tools that
can act on data frames to make data investigation very convenient.
However, most of these techniques are a bit more abstract, so in the first chapters
we emphasize a more direct, easier-to-learn approach, albeit sometimes a more
tedious one. Almost all of this material was rewritten for the second edition.
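The contrast between the two styles can be sketched as follows. The example is ours, using the built-in mtcars data set and an arbitrary weight cutoff of 3.5; it is not drawn from the chapters themselves.

```r
# Direct approach: pull out plain vectors and use logical indexing
wt <- mtcars$wt
mpg <- mtcars$mpg
mean_mpg_heavy <- mean(mpg[wt > 3.5])

# Data frame approach: subset() keeps the rows and columns of interest,
# and with() lets us refer to columns by name
heavy <- subset(mtcars, wt > 3.5, select = c(mpg, wt))
mean_mpg_heavy_df <- with(heavy, mean(mpg))
```

Both computations give the same value. The first spells out each step, while the second keeps related variables together in one object.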