Deep Learning (Ian Goodfellow, Yoshua Bengio)
Updated 2023-03-16
An excellent deep learning reference book! A new book by Ian Goodfellow, a leading deep learning researcher at Google and the inventor of GANs, and Yoshua Bengio.
Deep Learning
Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents
Website
Acknowledgments
Notation
1 Introduction
1.1 Who Should Read This Book?
1.2 Historical Trends in Deep Learning
I Applied Math and Machine Learning Basics
2 Linear Algebra
2.1 Scalars, Vectors, Matrices and Tensors
2.2 Multiplying Matrices and Vectors
2.3 Identity and Inverse Matrices
2.4 Linear Dependence and Span
2.5 Norms
2.6 Special Kinds of Matrices and Vectors
2.7 Eigendecomposition
2.8 Singular Value Decomposition
2.9 The Moore-Penrose Pseudoinverse
2.10 The Trace Operator
2.11 The Determinant
2.12 Example: Principal Components Analysis
3 Probability and Information Theory
3.1 Why Probability?
3.2 Random Variables
3.3 Probability Distributions
3.4 Marginal Probability
3.5 Conditional Probability
3.6 The Chain Rule of Conditional Probabilities
3.7 Independence and Conditional Independence
3.8 Expectation, Variance and Covariance
3.9 Common Probability Distributions
3.10 Useful Properties of Common Functions
3.11 Bayes’ Rule
3.12 Technical Details of Continuous Variables
3.13 Information Theory
3.14 Structured Probabilistic Models
4 Numerical Computation
4.1 Overflow and Underflow
4.2 Poor Conditioning
4.3 Gradient-Based Optimization
4.4 Constrained Optimization
4.5 Example: Linear Least Squares
5 Machine Learning Basics
5.1 Learning Algorithms
5.2 Capacity, Overfitting and Underfitting
5.3 Hyperparameters and Validation Sets
5.4 Estimators, Bias and Variance
5.5 Maximum Likelihood Estimation
5.6 Bayesian Statistics
5.7 Supervised Learning Algorithms
5.8 Unsupervised Learning Algorithms
5.9 Stochastic Gradient Descent
5.10 Building a Machine Learning Algorithm
5.11 Challenges Motivating Deep Learning
II Deep Networks: Modern Practices
6 Deep Feedforward Networks
6.1 Example: Learning XOR
6.2 Gradient-Based Learning
6.3 Hidden Units
6.4 Architecture Design
6.5 Back-Propagation and Other Differentiation Algorithms
6.6 Historical Notes
7 Regularization for Deep Learning
7.1 Parameter Norm Penalties
7.2 Norm Penalties as Constrained Optimization
7.3 Regularization and Under-Constrained Problems
7.4 Dataset Augmentation
7.5 Noise Robustness
7.6 Semi-Supervised Learning
7.7 Multi-Task Learning
7.8 Early Stopping
7.9 Parameter Tying and Parameter Sharing
7.10 Sparse Representations
7.11 Bagging and Other Ensemble Methods
7.12 Dropout
7.13 Adversarial Training
7.14 Tangent Distance, Tangent Prop, and Manifold Tangent Classifier
8 Optimization for Training Deep Models
8.1 How Learning Differs from Pure Optimization
8.2 Challenges in Neural Network Optimization
8.3 Basic Algorithms
8.4 Parameter Initialization Strategies
8.5 Algorithms with Adaptive Learning Rates
8.6 Approximate Second-Order Methods
8.7 Optimization Strategies and Meta-Algorithms
9 Convolutional Networks
9.1 The Convolution Operation
9.2 Motivation
9.3 Pooling
9.4 Convolution and Pooling as an Infinitely Strong Prior
9.5 Variants of the Basic Convolution Function
9.6 Structured Outputs
9.7 Data Types
9.8 Efficient Convolution Algorithms
9.9 Random or Unsupervised Features
9.10 The Neuroscientific Basis for Convolutional Networks
9.11 Convolutional Networks and the History of Deep Learning
10 Sequence Modeling: Recurrent and Recursive Nets
10.1 Unfolding Computational Graphs
10.2 Recurrent Neural Networks
10.3 Bidirectional RNNs
10.4 Encoder-Decoder Sequence-to-Sequence Architectures
10.5 Deep Recurrent Networks
10.6 Recursive Neural Networks
10.7 The Challenge of Long-Term Dependencies
10.8 Echo State Networks
10.9 Leaky Units and Other Strategies for Multiple Time Scales
10.10 The Long Short-Term Memory and Other Gated RNNs
10.11 Optimization for Long-Term Dependencies
10.12 Explicit Memory
11 Practical Methodology
11.1 Performance Metrics
11.2 Default Baseline Models
11.3 Determining Whether to Gather More Data
11.4 Selecting Hyperparameters
11.5 Debugging Strategies
11.6 Example: Multi-Digit Number Recognition
12 Applications
12.1 Large-Scale Deep Learning
12.2 Computer Vision
12.3 Speech Recognition
12.4 Natural Language Processing
12.5 Other Applications
III Deep Learning Research
13 Linear Factor Models
13.1 Probabilistic PCA and Factor Analysis
13.2 Independent Component Analysis (ICA)
13.3 Slow Feature Analysis
13.4 Sparse Coding