1. Introduction
of least squares, which implemented the earliest form of what is now known
as linear regression. The approach was first successfully applied to problems
in astronomy. Linear regression is used for predicting quantitative values,
such as an individual's salary. In order to predict qualitative values, such as
whether a patient survives or dies, or whether the stock market increases
or decreases, Fisher proposed linear discriminant analysis in 1936. In the
1940s, various authors put forth an alternative approach, logistic regression.
In the early 1970s, Nelder and Wedderburn coined the term generalized
linear models for an entire class of statistical learning methods that include
both linear and logistic regression as special cases.
By the end of the 1970s, many more techniques for learning from data
were available. However, they were almost exclusively linear methods, because
fitting non-linear relationships was computationally infeasible at the
time. By the 1980s, computing technology had finally improved sufficiently
that non-linear methods were no longer computationally prohibitive. In the
mid-1980s, Breiman, Friedman, Olshen, and Stone introduced classification and
regression trees, and were among the first to demonstrate the power of a
detailed practical implementation of a method, including cross-validation
for model selection. Hastie and Tibshirani coined the term generalized addi-
tive models in 1986 for a class of non-linear extensions to generalized linear
models, and also provided a practical software implementation.
Since that time, inspired by the advent of machine learning and other
disciplines, statistical learning has emerged as a new subfield in statistics,
focused on supervised and unsupervised modeling and prediction. In recent
years, progress in statistical learning has been marked by the increasing
availability of powerful and relatively user-friendly software, such as the
popular and freely available R system. This has the potential to continue
the transformation of the field from a set of techniques used and developed
by statisticians and computer scientists to an essential toolkit for a much
broader community.
This Book
The Elements of Statistical Learning (ESL) by Hastie, Tibshirani, and
Friedman was first published in 2001. Since that time, it has become an
important reference on the fundamentals of statistical machine learning.
Its success derives from its comprehensive and detailed treatment of many
important topics in statistical learning, as well as the fact that (relative to
many upper-level statistics textbooks) it is accessible to a wide audience.
However, the greatest factor behind the success of ESL has been its topical
nature. At the time of its publication, interest in the field of statistical