Introduction to Deep Learning: A Detailed Guide by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

Updated 2024-07-19 · PDF, 24.39 MB

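Deep Learning, co-authored by Ian Goodfellow, Yoshua Bengio, and Aaron Courville, is a classic text that covers the core theory and techniques of the field. It presents the historical background and trends of deep learning in an accessible way, and it covers the core concepts of applied mathematics and machine learning basics.

The introductory chapter begins by describing the intended readership: the book is written for professional researchers, and also for engineers and students who want to understand the basic principles of the field. The authors trace the development of deep learning back to the 1940s and attribute its marked rise in recent years to growing computing power and technical breakthroughs.

Part I, "Applied Math and Machine Learning Basics", lays the theoretical foundation. It starts with linear algebra, explaining vectors, matrices, and tensors and the role they play in deep learning. Readers learn about matrix multiplication, identity and inverse matrices, linear dependence and span, and the properties of special kinds of matrices and vectors. The authors also cover eigendecomposition, singular value decomposition (SVD), and the Moore-Penrose pseudoinverse, all of which are basic tools in training deep learning models.

Probability and information theory form the second pillar. This part discusses why probability matters when working with data and introduces random variables, probability distributions, marginal probability, conditional probability, the chain rule, and independence. These ideas are essential for understanding how deep learning models uncertainty, for example in Bayesian inference and regularization for neural networks.

Principal component analysis (PCA), one of the book's key worked examples, shows how these mathematical tools apply to a practical dimensionality-reduction problem, which bears directly on feature engineering and on keeping models simple.

Later chapters go into the structure of neural networks, the back-propagation algorithm, activation functions, optimization methods, and deep learning architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Overall, Deep Learning aims to give readers a solid theoretical foundation for understanding and building complex deep learning systems. It is a valuable resource for newcomers and experienced researchers alike.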

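Functions

$f : \mathbb{A} \to \mathbb{B}$    The function $f$ with domain $\mathbb{A}$ and range $\mathbb{B}$
$f \circ g$    Composition of the functions $f$ and $g$
$f(\boldsymbol{x}; \boldsymbol{\theta})$    A function of $\boldsymbol{x}$ parametrized by $\boldsymbol{\theta}$. Sometimes we just write $f(\boldsymbol{x})$ and ignore the argument $\boldsymbol{\theta}$ to lighten notation.
$\log x$    Natural logarithm of $x$
$\sigma(x)$    Logistic sigmoid, $\dfrac{1}{1 + \exp(-x)}$
$\zeta(x)$    Softplus, $\log(1 + \exp(x))$
$\|\boldsymbol{x}\|_p$    $L^p$ norm of $\boldsymbol{x}$
$x^+$    Positive part of $x$, i.e., $\max(0, x)$
$\mathbf{1}_{\text{condition}}$    is 1 if the condition is true, 0 otherwise

Sometimes we use a function $f$ whose argument is a scalar, but apply it to a vector, matrix, or tensor: $f(\boldsymbol{x})$, $f(\boldsymbol{X})$, or $f(\mathsf{X})$. This means to apply $f$ to the array element-wise. For example, if $\mathsf{C} = \sigma(\mathsf{X})$, then $\mathsf{C}_{i,j,k} = \sigma(\mathsf{X}_{i,j,k})$ for all valid values of $i$, $j$ and $k$.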
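As a rough illustration of the function notation above, the following Python sketch writes out the sigmoid, softplus, positive part, and indicator as plain scalar functions and then applies one of them element-wise to a small tensor. The helper name `elementwise` and the nested-list tensor `X` are assumptions made for this example; they do not come from the book.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x):
    """Softplus: log(1 + exp(x))."""
    return math.log(1.0 + math.exp(x))


def positive_part(x):
    """Positive part of x: max(0, x)."""
    return max(0.0, x)


def indicator(condition):
    """1 if the condition is true, 0 otherwise."""
    return 1 if condition else 0


def elementwise(f, array):
    """Apply a scalar function f to every entry of a nested list.

    Hypothetical helper: nested lists stand in for vectors, matrices,
    and tensors so that the example needs only the standard library.
    """
    if isinstance(array, list):
        return [elementwise(f, item) for item in array]
    return f(array)


# A 2 x 2 x 2 tensor stored as nested lists (illustrative values only).
X = [[[0.0, 1.0], [-1.0, 2.0]],
     [[3.0, -2.0], [0.5, -0.5]]]

# C[i][j][k] == sigmoid(X[i][j][k]) for every valid i, j, k.
C = elementwise(sigmoid, X)
```

These are the textbook formulas written naively. For inputs of very large magnitude, `math.exp` can overflow, so practical code would typically use a numerically stable variant.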

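Datasets and distributions

$p_{\text{data}}$    The data generating distribution
$\hat{p}_{\text{data}}$    The empirical distribution defined by the training set
$\mathbb{X}$    A set of training examples
$\boldsymbol{x}^{(i)}$    The $i$-th example (input) from a dataset
$y^{(i)}$ or $\boldsymbol{y}^{(i)}$    The target associated with $\boldsymbol{x}^{(i)}$ for supervised learning
$\boldsymbol{X}$    The $m \times n$ matrix with input example $\boldsymbol{x}^{(i)}$ in row $\boldsymbol{X}_{i,:}$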
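To make the dataset notation concrete, here is a small Python sketch under stated assumptions: the four training examples and their targets are invented toy values, and the function name `empirical_distribution` is hypothetical. It builds the design matrix with one example per row, and computes the empirical distribution by placing probability mass 1/m on each of the m training examples.

```python
from collections import Counter

# Hypothetical toy training set: m = 4 examples with n = 2 features each.
examples = [(1, 0), (0, 1), (1, 0), (1, 1)]  # x^(1), ..., x^(m)
targets = [1, 0, 1, 0]                       # y^(1), ..., y^(m)

# Design matrix X: an m x n matrix whose i-th row holds example x^(i).
# Python rows are 0-indexed, whereas the notation above is 1-indexed.
X = [list(x) for x in examples]
m, n = len(X), len(X[0])


def empirical_distribution(samples):
    """Put probability mass 1/m on each of the m training examples."""
    counts = Counter(samples)
    total = len(samples)
    return {value: count / total for value, count in counts.items()}


p_hat = empirical_distribution(examples)
```

Because the example `(1, 0)` appears twice among four samples, it receives mass 2/4 under `p_hat`; values that never occur in the training set receive no mass at all.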

CHAPTER 1. INTRODUCTION

... concepts are built on top of each other, the graph is deep, with many layers. For this reason, we call this approach to AI deep learning.

Many of the early successes of AI took place in relatively sterile and formal environments and did not require computers to have much knowledge about the world. For example, IBM's Deep Blue chess-playing system defeated world champion Garry Kasparov in 1997 (Hsu, 2002). Chess is of course a very simple world, containing only sixty-four locations and thirty-two pieces that can move in only rigidly circumscribed ways. Devising a successful chess strategy is a tremendous accomplishment, but the challenge is not due to the difficulty of describing the set of chess pieces and allowable moves to the computer. Chess can be completely described by a very brief list of completely formal rules, easily provided ahead of time by the programmer.

Ironically, abstract and formal tasks that are among the most difficult mental undertakings for a human being are among the easiest for a computer. Computers have long been able to defeat even the best human chess player, but are only recently matching some of the abilities of average human beings to recognize objects or speech. A person's everyday life requires an immense amount of knowledge about the world. Much of this knowledge is subjective and intuitive, and therefore difficult to articulate in a formal way. Computers need to capture this same knowledge in order to behave in an intelligent way. One of the key challenges in artificial intelligence is how to get this informal knowledge into a computer.

Several artificial intelligence projects have sought to hard-code knowledge about the world in formal languages. A computer can reason about statements in these formal languages automatically using logical inference rules. This is known as the knowledge base approach to artificial intelligence. None of these projects has led to a major success. One of the most famous such projects is Cyc (Lenat and Guha, 1989). Cyc is an inference engine and a database of statements in a language called CycL. These statements are entered by a staff of human supervisors. It is an unwieldy process. People struggle to devise formal rules with enough complexity to accurately describe the world. For example, Cyc failed to understand a story about a person named Fred shaving in the morning (Linde, 1992). Its inference engine detected an inconsistency in the story: it knew that people do not have electrical parts, but because Fred was holding an electric razor, it believed the entity "FredWhileShaving" contained electrical parts. It therefore asked whether Fred was still a person while he was shaving.

The difficulties faced by systems relying on hard-coded knowledge suggest that AI systems need the ability to acquire their own knowledge, by extracting patterns from raw data. This capability is known as machine learning. The introduction of machine learning allowed computers to tackle problems involving knowledge of the real world and make decisions that appear subjective. A simple machine learning algorithm called logistic regression can determine whether to recommend cesarean delivery (Mor-Yosef et al., 1990). A simple machine learning algorithm called naive Bayes can separate legitimate e-mail from spam e-mail.
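To give a feel for how naive Bayes can separate spam from legitimate e-mail, here is a minimal sketch in Python. It assumes a bag-of-words model with add-one (Laplace) smoothing, which is one common variant and not necessarily the one the authors have in mind. The four training messages, the labels "spam" and "ham", and the function names are all invented for illustration.

```python
import math
from collections import Counter


def train_naive_bayes(messages, labels):
    """Count messages per class and words per class from labelled text."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocabulary = model
    total_messages = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_messages)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one smoothing keeps unseen (word, class) pairs above zero.
            likelihood = (word_counts[label][word] + 1) / (
                total_words + len(vocabulary)
            )
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, for illustration only.
messages = [
    "win money now",
    "cheap money offer",
    "meeting schedule today",
    "project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(messages, labels)
```

With this toy data, `classify("cheap money", model)` returns `"spam"` and `classify("meeting today", model)` returns `"ham"`, because each message shares words only with the training messages of one class.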

The performance of these simple machine learning algorithms depends heavily on the representation of the data they are given. For example, when logistic regression is used to recommend cesarean delivery, the AI system does not examine the patient directly. Instead, the doctor tells the system several pieces of relevant information, such as the presence or absence of a uterine scar. Each piece of information included in the representation of the patient is known as a feature. Logistic regression learns how each of these features of the patient correlates with various outcomes. However, it cannot influence the way that the features are defined in any way. If logistic regression was given an MRI scan of the patient, rather than the doctor's formalized report, it would not be able to make useful predictions. Individual pixels in an MRI scan have negligible correlation with any complications that might occur during delivery.
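The way logistic regression ties each hand-defined feature to an outcome can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the two binary features, the six synthetic records, and the training settings are invented and carry no medical meaning. The model is fitted by plain batch gradient descent, which is only one of several ways to train it.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive outcome given one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train_logistic_regression(X, y, learning_rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean negative log-likelihood."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, target in zip(X, y):
            error = predict_proba(weights, bias, features) - target
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / m for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / m
    return weights, bias


# Synthetic records, NOT clinical data. Each row is a hand-defined
# representation: [feature_a_present, feature_b_present].
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y = [1, 1, 0, 0, 1, 0]  # hypothetical outcome labels

weights, bias = train_logistic_regression(X, y)
```

In this toy set the first feature lines up with the outcome, so its learned weight comes out positive. The model only assigns a weight to each feature it is handed; it has no way to redefine the features, which is the limitation the paragraph above describes.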

This dependence on representations is a general phenomenon that appears throughout computer science and even daily life. In computer science, operations such as searching a collection of data can proceed exponentially faster if the collection is structured and indexed intelligently. People can easily perform arithmetic on Arabic numerals, but find arithmetic on Roman numerals much more time-consuming. It is not surprising that the choice of representation has an enormous effect on the performance of machine learning algorithms. For a simple visual example, see Fig. 1.1.
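The remark about searching can be made concrete with a small Python comparison. This sketch takes a sorted list as one simple instance of a "structured" collection, and counts how many elements each strategy inspects. The function names and the data are chosen for illustration only.

```python
def linear_search(items, target):
    """Scan a list front to back; return (index, elements inspected)."""
    inspected = 0
    for index, value in enumerate(items):
        inspected += 1
        if value == target:
            return index, inspected
    return -1, inspected


def binary_search(sorted_items, target):
    """Halve the search range each step; return (index, elements inspected)."""
    low, high = 0, len(sorted_items) - 1
    inspected = 0
    while low <= high:
        mid = (low + high) // 2
        inspected += 1
        if sorted_items[mid] == target:
            return mid, inspected
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, inspected


data = list(range(1_000_000))  # sorted, so binary search applies
target = 999_999               # worst case for the linear scan

_, linear_steps = linear_search(data, target)
_, binary_steps = binary_search(data, target)
```

For this input the linear scan inspects all 1,000,000 elements, while the binary search needs at most 20 probes, since 2^20 exceeds one million. The data are the same in both cases; only the way the search exploits their organization differs.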

Many artificial intelligence tasks can be solved by designing the right set of features to extract for that task, then providing these features to a simple machine learning algorithm. For example, a useful feature for speaker identification from sound is an estimate of the size of the speaker's vocal tract. It therefore gives a strong clue as to whether the speaker is a man, woman, or child.

However, for many tasks, it is difficult to know what features should be extracted. For example, suppose that we would like to write a program to detect cars in photographs. We know that cars have wheels, so we might like to use the presence of a wheel as a feature. Unfortunately, it is difficult to describe exactly what a wheel looks like in terms of pixel values. A wheel has a simple geometric shape but its image may be complicated by shadows falling on the wheel, the sun glaring off the metal parts of the wheel, the fender of the car or an object in the foreground obscuring part of the wheel, and so on.
