The Elements of Statistical Learning, Second Edition: A Data-Driven Machine Learning Classic
"《统计学习要素》第二版是机器学习领域的一本经典著作,由Hastie、Tibshirani和Friedman三位作者撰写。该书在第一版大受欢迎的基础上,随着研究领域的快速发展,作者们决定推出第二版以反映最新的研究成果。新版本共增加了四章,并对原有章节进行了更新,以保持内容的时效性。 新版本的主要变化包括: 1. 在引言部分,作者引用了威廉·爱德华兹·戴明的名言:“我们信任上帝,其他人则带来数据。”尽管这个名言在网络上传播广泛,但据Hayden教授所述,他并未原创此言,而关于戴明是否真的说过这句话的“数据”证据难以寻觅,这体现了统计学习领域研究中对数据可靠性的重视。 2. 新增的章节涵盖了全新的主题,这些内容反映了近年来统计学习理论和技术的扩展。例如,可能有章节探讨了深度学习、大数据分析、模型选择与交叉验证的新进展,以及在高维数据处理中的新颖方法。 3. 对于已有的章节,作者可能重新审视并整合了最新的研究成果,确保内容的精确性和实用性。比如,章节可能更新了特征选择和降维技术,介绍了更高效的算法和模型优化策略。 4. 为了保持读者的阅读流畅性,尽管第二版有所扩展,但作者尽量保持原有的结构框架不变,仅在必要处进行调整,以便让熟悉第一版的读者能快速定位和理解新内容。 5. 此次出版不仅是一次技术的更新,也可能是对教学方法的反思,强调了统计学习理论在实际应用中的重要性,以及如何将理论知识转化为解决现实问题的能力。 《统计学习要素》第二版是对第一版的有益补充,旨在帮助读者紧跟机器学习领域的前沿动态,提升理解和应用统计学习技术的水平。无论你是初学者还是专业人士,这本书都是深入理解复杂数据分析方法和模型构建的宝贵资源。"
14 Unsupervised Learning 485
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 485
14.2 Association Rules . . . . . . . . . . . . . . . . . . . . . . 487
14.2.1 Market Basket Analysis . . . . . . . . . . . . . . 488
14.2.2 The Apriori Algorithm . . . . . . . . . . . . . . 489
14.2.3 Example: Market Basket Analysis . . . . . . . . 492
14.2.4 Unsupervised as Supervised Learning . . . . . . 495
14.2.5 Generalized Association Rules . . . . . . . . . . 497
14.2.6 Choice of Supervised Learning Method . . . . . 499
14.2.7 Example: Market Basket Analysis (Continued) . 499
14.3 Cluster Analysis . . . . . . . . . . . . . . . . . . . . . . . 501
14.3.1 Proximity Matrices . . . . . . . . . . . . . . . . 503
14.3.2 Dissimilarities Based on Attributes . . . . . . . 503
14.3.3 Object Dissimilarity . . . . . . . . . . . . . . . . 505
14.3.4 Clustering Algorithms . . . . . . . . . . . . . . . 507
14.3.5 Combinatorial Algorithms . . . . . . . . . . . . 507
14.3.6 K-means . . . . . . . . . . . . . . . . . . . . . . 509
14.3.7 Gaussian Mixtures as Soft K-means Clustering . 510
14.3.8 Example: Human Tumor Microarray Data . . . 512
14.3.9 Vector Quantization . . . . . . . . . . . . . . . . 514
14.3.10 K-medoids . . . . . . . . . . . . . . . . . . . . . 515
14.3.11 Practical Issues . . . . . . . . . . . . . . . . . . 518
14.3.12 Hierarchical Clustering . . . . . . . . . . . . . . 520
14.4 Self-Organizing Maps . . . . . . . . . . . . . . . . . . . . 528
14.5 Principal Components, Curves and Surfaces . . . . . . . . 534
14.5.1 Principal Components . . . . . . . . . . . . . . . 534
14.5.2 Principal Curves and Surfaces . . . . . . . . . . 541
14.5.3 Spectral Clustering . . . . . . . . . . . . . . . . 544
14.5.4 Kernel Principal Components . . . . . . . . . . . 547
14.5.5 Sparse Principal Components . . . . . . . . . . . 550
14.6 Non-negative Matrix Factorization . . . . . . . . . . . . . 553
14.6.1 Archetypal Analysis . . . . . . . . . . . . . . . . 554
14.7 Independent Component Analysis
and Exploratory Projection Pursuit . . . . . . . . . . . . 557
14.7.1 Latent Variables and Factor Analysis . . . . . . 558
14.7.2 Independent Component Analysis . . . . . . . . 560
14.7.3 Exploratory Projection Pursuit . . . . . . . . . . 565
14.7.4 A Direct Approach to ICA . . . . . . . . . . . . 565
14.8 Multidimensional Scaling . . . . . . . . . . . . . . . . . . 570
14.9 Nonlinear Dimension Reduction
and Local Multidimensional Scaling . . . . . . . . . . . . 572
14.10 The Google PageRank Algorithm . . . . . . . . . . . . . 576
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 578
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
15 Random Forests 587
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 587
15.2 Definition of Random Forests . . . . . . . . . . . . . . . . 587
15.3 Details of Random Forests . . . . . . . . . . . . . . . . . 592
15.3.1 Out of Bag Samples . . . . . . . . . . . . . . . . 592
15.3.2 Variable Importance . . . . . . . . . . . . . . . . 593
15.3.3 Proximity Plots . . . . . . . . . . . . . . . . . . 595
15.3.4 Random Forests and Overfitting . . . . . . . . . 596
15.4 Analysis of Random Forests . . . . . . . . . . . . . . . . . 597
15.4.1 Variance and the De-Correlation Effect . . . . . 597
15.4.2 Bias . . . . . . . . . . . . . . . . . . . . . . . . . 600
15.4.3 Adaptive Nearest Neighbors . . . . . . . . . . . 601
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 602
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
16 Ensemble Learning 605
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 605
16.2 Boosting and Regularization Paths . . . . . . . . . . . . . 607
16.2.1 Penalized Regression . . . . . . . . . . . . . . . 607
16.2.2 The “Bet on Sparsity” Principle . . . . . . . . . 610
16.2.3 Regularization Paths, Over-fitting and Margins . 613
16.3 Learning Ensembles . . . . . . . . . . . . . . . . . . . . . 616
16.3.1 Learning a Good Ensemble . . . . . . . . . . . . 617
16.3.2 Rule Ensembles . . . . . . . . . . . . . . . . . . 622
Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . . . . . 623
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
17 Undirected Graphical Models 625
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 625
17.2 Markov Graphs and Their Properties . . . . . . . . . . . 627
17.3 Undirected Graphical Models for Continuous Variables . 630
17.3.1 Estimation of the Parameters
when the Graph Structure is Known . . . . . . . 631
17.3.2 Estimation of the Graph Structure . . . . . . . . 635
17.4 Undirected Graphical Models for Discrete Variables . . . 638
17.4.1 Estimation of the Parameters
when the Graph Structure is Known . . . . . . . 639
17.4.2 Hidden Nodes . . . . . . . . . . . . . . . . . . . 641
17.4.3 Estimation of the Graph Structure . . . . . . . . 642
17.4.4 Restricted Boltzmann Machines . . . . . . . . . 643
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
18 High-Dimensional Problems: p ≫ N 649
18.1 When p is Much Bigger than N . . . . . . . . . . . . . . 649
18.2 Diagonal Linear Discriminant Analysis
and Nearest Shrunken Centroids . . . . . . . . . . . . . . 651
18.3 Linear Classifiers with Quadratic Regularization . . . . . 654
18.3.1 Regularized Discriminant Analysis . . . . . . . . 656
18.3.2 Logistic Regression
with Quadratic Regularization . . . . . . . . . . 657
18.3.3 The Support Vector Classifier . . . . . . . . . . 657
18.3.4 Feature Selection . . . . . . . . . . . . . . . . . . 658
18.3.5 Computational Shortcuts When p ≫ N . . . . . 659
18.4 Linear Classifiers with L1 Regularization . . . . . . . . . 661
18.4.1 Application of Lasso
to Protein Mass Spectroscopy . . . . . . . . . . 664
18.4.2 The Fused Lasso for Functional Data . . . . . . 666
18.5 Classification When Features are Unavailable . . . . . . . 668
18.5.1 Example: String Kernels
and Protein Classification . . . . . . . . . . . . . 668
18.5.2 Classification and Other Models Using
Inner-Product Kernels and Pairwise Distances . 670
18.5.3 Example: Abstracts Classification . . . . . . . . 672
18.6 High-Dimensional Regression:
Supervised Principal Components . . . . . . . . . . . . . 674
18.6.1 Connection to Latent-Variable Modeling . . . . 678
18.6.2 Relationship with Partial Least Squares . . . . . 680
18.6.3 Pre-Conditioning for Feature Selection . . . . . 681
18.7 Feature Assessment and the Multiple-Testing Problem . . 683
18.7.1 The False Discovery Rate . . . . . . . . . . . . . 687
18.7.2 Asymmetric Cutpoints and the SAM Procedure 690
18.7.3 A Bayesian Interpretation of the FDR . . . . . . 692
18.8 Bibliographic Notes . . . . . . . . . . . . . . . . . . . . . 693
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 694
References 699
Author Index 729
Index 737
1 Introduction
Statistical learning plays a key role in many areas of science, finance and
industry. Here are some examples of learning problems:
• Predict whether a patient, hospitalized due to a heart attack, will
have a second heart attack. The prediction is to be based on demographic,
diet and clinical measurements for that patient.
• Predict the price of a stock in 6 months from now, on the basis of
company performance measures and economic data.
• Identify the numbers in a handwritten ZIP code, from a digitized
image.
• Estimate the amount of glucose in the blood of a diabetic person,
from the infrared absorption spectrum of that person’s blood.
• Identify the risk factors for prostate cancer, based on clinical and
demographic variables.
The science of learning plays a key role in the fields of statistics, data
mining and artificial intelligence, intersecting with areas of engineering and
other disciplines.
This book is about learning from data. In a typical scenario, we have
an outcome measurement, usually quantitative (such as a stock price) or
categorical (such as heart attack/no heart attack), that we wish to predict
based on a set of features (such as diet and clinical measurements). We
have a training set of data, in which we observe the outcome and feature
measurements for a set of objects (such as people). Using this data we build
a prediction model, or learner, which will enable us to predict the outcome
for new unseen objects. A good learner is one that accurately predicts such
an outcome.

TABLE 1.1. Average percentage of words or characters in an email message
equal to the indicated word or character. We have chosen the words and
characters showing the largest difference between spam and email.

         george   you  your    hp  free   hpl     !   our    re   edu  remove
  spam     0.00  2.26  1.38  0.02  0.52  0.01  0.51  0.51  0.13  0.01    0.28
  email    1.27  1.27  0.44  0.90  0.07  0.43  0.11  0.18  0.42  0.29    0.01
The examples above describe what is called the supervised learning problem.
It is called “supervised” because of the presence of the outcome variable
to guide the learning process. In the unsupervised learning problem,
we observe only the features and have no measurements of the outcome.
Our task is rather to describe how the data are organized or clustered. We
devote most of this book to supervised learning; the unsupervised problem
is less developed in the literature, and is the focus of Chapter 14.
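
Before turning to concrete examples, here is a minimal sketch of the supervised setup just described: a training set of feature measurements and observed outcomes, a learner fit to that set, and a prediction for a new, unseen object. The sketch is illustrative only; the scikit-learn classifier and the toy numbers are assumptions, not something taken from the book.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training set: two clinical-style features per patient and a
# categorical outcome (1 = second heart attack, 0 = none). Values are made up.
X_train = np.array([[0.2, 1.1], [0.4, 0.9], [1.5, 0.3], [1.8, 0.2]])
y_train = np.array([0, 0, 1, 1])

learner = LogisticRegression()
learner.fit(X_train, y_train)      # learn from the observed outcomes and features

X_new = np.array([[1.6, 0.25]])    # a new, unseen object
print(learner.predict(X_new))      # predicted outcome for that object
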
Here are some examples of real learning problems that are discussed in
this book.
Example 1: Email Spam
The data for this example consists of information from 4601 email messages,
in a study to try to predict whether the email was junk email, or
“spam.” The objective was to design an automatic spam detector that
could filter out spam before clogging the users’ mailboxes. For all 4601
email messages, the true outcome (email type) email or spam is available,
along with the relative frequencies of 57 of the most commonly occurring
words and punctuation marks in the email message. This is a supervised
learning problem, with the outcome the class variable email/spam. It is also
called a classification problem.
Table 1.1 lists the words and characters showing the largest average
difference between spam and email.
Our learning method has to decide which features to use and how: for
example, we might use a rule such as
if (%george < 0.6) & (%you > 1.5) then spam
else email.
Another form of a rule might be:
if (0.2 · %you − 0.3 · %george) > 0 then spam
else email.
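
Either rule can be written directly in code. The following sketch is illustrative only: the function names and the dictionary of word percentages are assumptions, while the thresholds and weights are taken verbatim from the two rules above and the class averages from Table 1.1.

def rule_threshold(freq):
    # if (%george < 0.6) & (%you > 1.5) then spam, else email
    return "spam" if freq["george"] < 0.6 and freq["you"] > 1.5 else "email"

def rule_linear(freq):
    # if (0.2 * %you - 0.3 * %george) > 0 then spam, else email
    return "spam" if 0.2 * freq["you"] - 0.3 * freq["george"] > 0 else "email"

# Average percentages of "george" and "you" for each class, from Table 1.1.
spam_avg = {"george": 0.00, "you": 2.26}
email_avg = {"george": 1.27, "you": 1.27}

print(rule_threshold(spam_avg), rule_threshold(email_avg))  # spam email
print(rule_linear(spam_avg), rule_linear(email_avg))        # spam email

In practice the thresholds and weights are not guessed but estimated from the training data, which is what the learning methods developed in this book do.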