Testing the potential of deep learning presents unique challenges because any single application
brings together various disciplines. Applying deep learning requires simultaneously understand-
ing (i) the motivations for casting a problem in a particular way; (ii) the mathematics of a given
modeling approach; (iii) the optimization algorithms for tting the models to data; and (iv) the
engineering required to train models eciently, navigating the pitfalls of numerical computing
and getting the most out of available hardware. Teaching both the critical thinking skills required
to formulate problems, the mathematics to solve them, and the soware tools to implement those
solutions all in one place presents formidable challenges. Our goal in this book is to present a
unied resource to bring would-be practitioners up to speed.
At the time we started this book project, there were no resources that simultaneously (i) were
up to date; (ii) covered the full breadth of modern machine learning with substantial technical
depth; and (iii) interleaved exposition of the quality one expects from an engaging textbook with
the clean runnable code that one expects to nd in hands-on tutorials. We found plenty of code
examples for how to use a given deep learning framework (e.g., how to do basic numerical com-
puting with matrices in TensorFlow) or for implementing particular techniques (e.g., code snip-
pets for LeNet, AlexNet, ResNets, etc) scattered across various blog posts and GitHub repositories.
However, these examples typically focused on how to implement a given approach, but le out the
discussion of why certain algorithmic decisions are made. While some interactive resources have
popped up sporadically to address a particular topic, e.g., the engaging blog posts published on
the website Distill
3
, or personal blogs, they only covered selected topics in deep learning, and
oen lacked associated code. On the other hand, while several textbooks have emerged, most no-
tably (Goodfellow et al., 2016), which oers a comprehensive survey of the concepts behind deep
learning, these resources do not marry the descriptions to realizations of the concepts in code,
sometimes leaving readers clueless as to how to implement them. Moreover, too many resources
are hidden behind the paywalls of commercial course providers.
We set out to create a resource that could (i) be freely available for everyone; (ii) oer sucient
technical depth to provide a starting point on the path to actually becoming an applied machine
learning scientist; (iii) include runnable code, showing readers how to solve problems in practice;
(iv) allow for rapid updates, both by us and also by the community at large; and (v) be comple-
mented by a forum
4
for interactive discussion of technical details and to answer questions.
These goals were oen in conict. Equations, theorems, and citations are best managed and laid
out in LaTeX. Code is best described in Python. And webpages are native in HTML and JavaScript.
Furthermore, we want the content to be accessible both as executable code, as a physical book,
as a downloadable PDF, and on the Internet as a website. At present there exist no tools and no
workow perfectly suited to these demands, so we had to assemble our own. We describe our
approach in detail in Section 19.6. We settled on GitHub to share the source and to allow for edits,
Jupyter notebooks for mixing code, equations and text, Sphinx as a rendering engine to generate
multiple outputs, and Discourse for the forum. While our system is not yet perfect, these choices
provide a good compromise among the competing concerns. We believe that this might be the
rst book published using such an integrated workow.
3
http://distill.pub
4
http://discuss.d2l.ai
2 Contents