PREFACE
and set off to learn it. I was an experienced statistician and researcher, had 25 years
of experience as a SAS and SPSS programmer, and was fluent in a half dozen
programming languages. How hard could it be? Famous last words.
As I tried to learn the language (as fast as possible, with an interview looming), I
found either tomes on the underlying structure of the language or dense treatises on
specific advanced statistical methods, written by and for subject-matter experts. The
online help was written in a spartan style that was more reference than tutorial. Every
time I thought I had a handle on the overall organization and capabilities of R, I
found something new that made me feel ignorant and small.
To make sense of it all, I approached R as a data scientist. I thought about what it
takes to successfully process, analyze, and understand data, including:
■ Accessing the data (getting the data into the application from multiple sources)
■ Cleaning the data (coding missing data, fixing or deleting miscoded data, transforming variables into more useful formats)
■ Annotating the data (in order to remember what each piece represents)
■ Summarizing the data (getting descriptive statistics to help characterize the data)
■ Visualizing the data (because a picture really is worth a thousand words)
■ Modeling the data (uncovering relationships and testing hypotheses)
■ Preparing the results (creating publication-quality tables and graphs)
Then I tried to understand how I could use R to accomplish each of these tasks.
Because I learn best by teaching, I eventually created a website (www.statmethods.net)
to document what I had learned.
Then, about a year later, Marjan Bace, Manning’s publisher, called and asked if I
would like to write a book on R. I had already written 50 journal articles, 4 technical
manuals, numerous book chapters, and a book on research methodology, so how
hard could it be? At the risk of sounding repetitive—famous last words.
A year after the first edition came out in 2011, I started working on the second
edition. The R platform is evolving, and I wanted to describe these new developments. I
also wanted to expand the coverage of predictive analytics and data mining, which are
important topics in the world of big data. Finally, I wanted to add chapters on advanced data
visualization, software development, and dynamic report writing.
The book you’re holding is the one that I wish I’d had so many years ago. I have
tried to provide you with a guide to R that will allow you to quickly access the power of
this great open source endeavor, without all the frustration and angst. I hope you
enjoy it.
P.S. I was offered the job but didn’t take it. But learning R has taken my career in
directions that I could never have anticipated. Life can be funny.