
An Introduction to Deep Reinforcement Learning



A very practical textbook introducing deep reinforcement learning. Abstract: Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts.


An Introduction to Deep Reinforcement Learning

Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare and Joelle Pineau (2018), “An Introduction to Deep Reinforcement Learning”, Foundations and Trends in Machine Learning: Vol. 11, No. 3-4. DOI: 10.1561/2200000071.

Vincent François-Lavet

McGill University

vincent.francois-lavet@mcgill.ca

Peter Henderson

McGill University

peter.henderson@mail.mcgill.ca

Riashat Islam

McGill University

riashat.islam@mail.mcgill.ca

Marc G. Bellemare

Google Brain

bellemare@google.com

Joelle Pineau

Facebook, McGill University

jpineau@cs.mcgill.ca

Boston — Delft

arXiv:1811.12560v2 [cs.LG] 3 Dec 2018

Contents

1 Introduction
1.1 Motivation
1.2 Outline

2 Machine learning and deep learning
2.1 Supervised learning and the concepts of bias and overfitting
2.2 Unsupervised learning
2.3 The deep learning approach

3 Introduction to reinforcement learning
3.1 Formal framework
3.2 Different components to learn a policy
3.3 Different settings to learn a policy from data

4 Value-based methods for deep RL
4.1 Q-learning
4.2 Fitted Q-learning
4.3 Deep Q-networks
4.4 Double DQN
4.5 Dueling network architecture
4.6 Distributional DQN
4.7 Multi-step learning
4.8 Combination of all DQN improvements and variants of DQN

5 Policy gradient methods for deep RL
5.1 Stochastic Policy Gradient
5.2 Deterministic Policy Gradient
5.3 Actor-Critic Methods
5.4 Natural Policy Gradients
5.5 Trust Region Optimization
5.6 Combining policy gradient and Q-learning

6 Model-based methods for deep RL
6.1 Pure model-based methods
6.2 Integrating model-free and model-based methods

7 The concept of generalization
7.1 Feature selection
7.2 Choice of the learning algorithm and function approximator selection
7.3 Modifying the objective function
7.4 Hierarchical learning
7.5 How to obtain the best bias-overfitting tradeoff

8 Particular challenges in the online setting
8.1 Exploration/Exploitation dilemma
8.2 Managing experience replay

9 Benchmarking Deep RL
9.1 Benchmark Environments
9.2 Best practices to benchmark deep RL
9.3 Open-source software for Deep RL

10 Deep reinforcement learning beyond MDPs
10.1 Partial observability and the distribution of (related) MDPs
10.2 Transfer learning
10.3 Learning without explicit reward function
10.4 Multi-agent systems

11 Perspectives on deep reinforcement learning
11.1 Successes of deep reinforcement learning
11.2 Challenges of applying reinforcement learning to real-world problems
11.3 Relations between deep RL and neuroscience

12 Conclusion
12.1 Future development of deep RL
12.2 Applications and societal impact of deep RL

Appendices

References




