Deep Learning Fundamentals: The Cornerstone of Artificial Intelligence

Updated 2024-06-18 · PDF, 22.24 MB

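"Deep Learning is a classic text in artificial intelligence, written by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. It covers the foundations of deep learning and the applied mathematics behind it, including linear algebra, probability theory, and information theory, with the aim of helping readers understand and practice deep learning techniques."

Basic mathematics is the key to understanding how deep learning models work. Linear algebra is a major part of it: the book introduces vectors, matrices, and tensors and how they are multiplied. Its treatment of matrix multiplication goes beyond the basic rules to cover the identity matrix and the inverse matrix, which are essential for solving systems of linear equations and for reasoning about linear transformations. Linear dependence and span help describe the structure of data and indicate when its dimensionality can be reduced. The book also discusses special kinds of matrices, such as diagonal, orthogonal, and sparse matrices, which are widely used in practice. Eigendecomposition and singular value decomposition (SVD) are two key matrix factorizations, used in data analysis and machine learning for dimensionality reduction, feature extraction, and image processing. The Moore-Penrose pseudoinverse generalizes the matrix inverse to matrices that are not square or not of full rank. The trace operator and the determinant summarize properties of a matrix: the absolute value of the determinant measures how much the corresponding linear map scales volume, and the trace is unchanged by a change of basis.

Probability theory and information theory are the other cornerstone of machine learning. Why probability? Because it is the basis for reasoning under uncertainty and for modeling real-world phenomena. Random variables, probability distributions, and marginal and conditional probability form the basic framework. The chain rule of conditional probability factorizes a joint distribution into a product of conditionals, which makes complex conditional problems tractable. Independence and conditional independence simplify model construction. Expectation, variance, and covariance are the core summaries of a random variable and are central to parameter estimation, model evaluation, and prediction.

The book goes on to more advanced topics such as Bayes' rule, maximum likelihood estimation, entropy, and mutual information, all of which are needed to understand and optimize deep learning models. With this material, readers gain the theoretical foundation needed to build and train deep neural networks and to explore applications in computer vision, natural language processing, reinforcement learning, and other areas.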
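
The probability concepts listed above (marginal and conditional probability, the chain rule, Bayes' rule, and entropy) can be made concrete with a very small example. The sketch below is illustrative only: the joint distribution over weather and traffic is invented, and the helper names `marginal`, `conditional`, and `entropy` are hypothetical, not taken from the book.

```python
import math

# Invented joint distribution P(weather, traffic); the numbers are assumptions.
joint = {
    ("rain", "jam"): 0.30,
    ("rain", "clear"): 0.10,
    ("sun", "jam"): 0.15,
    ("sun", "clear"): 0.45,
}

def marginal(joint, index):
    """Sum the joint over the other variable to obtain a marginal distribution."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], 0.0) + p
    return out

def conditional(joint, given_index, given_value):
    """Distribution of the other variable given that one variable has a fixed value."""
    norm = sum(p for o, p in joint.items() if o[given_index] == given_value)
    other = 1 - given_index
    return {o[other]: p / norm
            for o, p in joint.items() if o[given_index] == given_value}

def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

p_weather = marginal(joint, 0)
p_traffic = marginal(joint, 1)
p_traffic_given_rain = conditional(joint, 0, "rain")

# Chain rule: P(rain, jam) = P(rain) * P(jam | rain)
chain = p_weather["rain"] * p_traffic_given_rain["jam"]

# Bayes' rule: P(rain | jam) = P(jam | rain) * P(rain) / P(jam)
bayes = p_traffic_given_rain["jam"] * p_weather["rain"] / p_traffic["jam"]
direct = conditional(joint, 1, "jam")["rain"]

print(chain, bayes, direct, entropy(p_weather))
```

Computing P(rain | jam) both directly from the joint and through Bayes' rule gives the same value, which is a quick way to check that the definitions are consistent.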

Chapter 1

Introduction

Inventors have long dreamed of creating machines that think. This desire dates back to at least the time of ancient Greece. The mythical figures Pygmalion, Daedalus, and Hephaestus may all be interpreted as legendary inventors, and Galatea, Talos, and Pandora may all be regarded as artificial life (Ovid and Martin, 2004; Sparkes, 1996; Tandy, 1997).

When programmable computers were first conceived, people wondered whether they might become intelligent, over a hundred years before one was built (Lovelace, 1842). Today, artificial intelligence (AI) is a thriving field with many practical applications and active research topics. We look to intelligent software to automate routine labor, understand speech or images, make diagnoses in medicine and support basic scientific research.

In the early days of artificial intelligence, the field rapidly tackled and solved problems that are intellectually difficult for human beings but relatively straightforward for computers: problems that can be described by a list of formal, mathematical rules. The true challenge to artificial intelligence proved to be solving the tasks that are easy for people to perform but hard for people to describe formally: problems that we solve intuitively, that feel automatic, like recognizing spoken words or faces in images.

This book is about a solution to these more intuitive problems. This solution is to allow computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined in terms of its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all of the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers. For this reason, we call this approach to AI deep learning.

Many of the early successes of AI took place in relatively sterile and formal environments and did not require computers to have much knowledge about the world. For example, IBM’s Deep Blue chess-playing system defeated world champion Garry Kasparov in 1997 (Hsu, 2002). Chess is of course a very simple world, containing only sixty-four locations and thirty-two pieces that can move in only rigidly circumscribed ways. Devising a successful chess strategy is a tremendous accomplishment, but the challenge is not due to the difficulty of describing the set of chess pieces and allowable moves to the computer. Chess can be completely described by a very brief list of completely formal rules, easily provided ahead of time by the programmer.

Ironically, abstract and formal tasks that are among the most difficult mental undertakings for a human being are among the easiest for a computer. Computers have long been able to defeat even the best human chess player, but are only recently matching some of the abilities of average human beings to recognize objects or speech. A person’s everyday life requires an immense amount of knowledge about the world. Much of this knowledge is subjective and intuitive, and therefore difficult to articulate in a formal way. Computers need to capture this same knowledge in order to behave in an intelligent way. One of the key challenges in artificial intelligence is how to get this informal knowledge into a computer.

Several artificial intelligence projects have sought to hard-code knowledge about the world in formal languages. A computer can reason about statements in these formal languages automatically using logical inference rules. This is known as the knowledge base approach to artificial intelligence. None of these projects has led to a major success. One of the most famous such projects is Cyc (Lenat and Guha, 1989). Cyc is an inference engine and a database of statements in a language called CycL. These statements are entered by a staff of human supervisors. It is an unwieldy process. People struggle to devise formal rules with enough complexity to accurately describe the world. For example, Cyc failed to understand a story about a person named Fred shaving in the morning (Linde, 1992). Its inference engine detected an inconsistency in the story: it knew that people do not have electrical parts, but because Fred was holding an electric razor, it believed the entity “FredWhileShaving” contained electrical parts. It therefore asked whether Fred was still a person while he was shaving.

The difficulties faced by systems relying on hard-coded knowledge suggest that AI systems need the ability to acquire their own knowledge, by extracting patterns from data. This capability is known as machine learning. The introduction of machine learning allowed computers to tackle problems involving knowledge of the real world and make decisions that appear subjective. A simple machine learning algorithm called logistic regression can determine whether to recommend cesarean delivery (Mor-Yosef et al., 1990). A simple machine learning algorithm called naive Bayes can separate legitimate e-mail from spam e-mail.
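
To give a feel for how naive Bayes can separate legitimate e-mail from spam, here is a minimal sketch. Everything in it is an assumption made for illustration: the four training messages are invented, the function names `fit` and `predict` are hypothetical, and add-one smoothing is one common choice rather than anything the text prescribes.

```python
import math
from collections import Counter, defaultdict

# Invented training set; a real filter would train on many thousands of messages.
train = [
    ("win cash prize now", "spam"),
    ("cheap prize click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

def fit(examples):
    """Count words per class and messages per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, class_counts, vocab

def predict(text, model):
    """Return the class with the highest log posterior (up to a constant)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)  # log prior
        n_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
print(predict("cash prize now", model))
print(predict("agenda for lunch monday", model))
```

The "naive" part is the assumption that words are conditionally independent given the class, which is why the per-word log probabilities are simply added.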

The performance of these simple machine learning algorithms depends heavily on the representation of the data they are given. For example, when logistic regression is used to recommend cesarean delivery, the AI system does not examine the patient directly. Instead, the doctor tells the system several pieces of relevant information, such as the presence or absence of a uterine scar. Each piece of information included in the representation of the patient is known as a feature. Logistic regression learns how each of these features of the patient correlates with various outcomes. However, it cannot influence the way that the features are defined in any way. If logistic regression was given an MRI scan of the patient, rather than the doctor’s formalized report, it would not be able to make useful predictions. Individual pixels in an MRI scan have negligible correlation with any complications that might occur during delivery.
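
The point that logistic regression only learns a weight per predefined feature can be sketched in a few lines. This is a toy under stated assumptions: the two binary features and all labels are invented and carry no medical meaning, and `train_logistic` and `predict_proba` are hypothetical helper names using plain batch gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability of the positive outcome for feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the mean log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

# Invented data. Feature 0 is associated with the outcome; feature 1 is not.
X = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 0], [0, 0], [0, 1], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 1]

w, b = train_logistic(X, y)
print(w, b, predict_proba([1, 0], w, b))
```

The model ends up with a large weight on the informative feature and a weight near zero on the uninformative one, but it has no way to invent a better feature than the ones it was handed.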

This dependence on representations is a general phenomenon that appears throughout computer science and even daily life. In computer science, operations such as searching a collection of data can proceed exponentially faster if the collection is structured and indexed intelligently. People can easily perform arithmetic on Arabic numerals, but find arithmetic on Roman numerals much more time-consuming. It is not surprising that the choice of representation has an enormous effect on the performance of machine learning algorithms. For a simple visual example, see Fig. 1.1.
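
The search claim above can be illustrated by counting steps. The sketch below compares a plain scan with binary search over the same values; the function names and the one-million-element list are assumptions chosen for illustration, and binary search stands in for "structured and indexed" only in the simplest sense of keeping the data sorted.

```python
def linear_search_steps(items, target):
    """Scan the list in order; return (found, number of elements inspected)."""
    for steps, item in enumerate(items, start=1):
        if item == target:
            return True, steps
    return False, len(items)

def binary_search_steps(sorted_items, target):
    """Binary search on a sorted list; return (found, number of probes)."""
    lo, hi, steps = 0, len(sorted_items), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_items[mid] < target:
            lo = mid + 1
        elif sorted_items[mid] > target:
            hi = mid
        else:
            return True, steps
    return False, steps

data = list(range(1_000_000))  # already sorted
found_lin, steps_lin = linear_search_steps(data, 999_999)
found_bin, steps_bin = binary_search_steps(data, 999_999)
print(steps_lin, steps_bin)
```

The scan inspects every element in the worst case, while binary search needs a number of probes that grows only with the logarithm of the collection size.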

Many artificial intelligence tasks can be solved by designing the right set of features to extract for that task, then providing these features to a simple machine learning algorithm. For example, a useful feature for speaker identification from sound is an estimate of the size of the speaker’s vocal tract. It therefore gives a strong clue as to whether the speaker is a man, woman, or child.

However, for many tasks, it is difficult to know what features should be extracted. For example, suppose that we would like to write a program to detect cars in photographs. We know that cars have wheels, so we might like to use the presence of a wheel as a feature. Unfortunately, it is difficult to describe exactly what a wheel looks like in terms of pixel values. A wheel has a simple geometric shape but its image may be complicated by shadows falling on the wheel, the sun glaring off the metal parts of the wheel, the fender of the car or an object in the foreground obscuring part of the wheel, and so on.


[...] that are directly observed. Instead, they may exist either as unobserved objects or unobserved forces in the physical world that affect observable quantities. They may also exist as constructs in the human mind that provide useful simplifying explanations or inferred causes of the observed data. They can be thought of as concepts or abstractions that help us make sense of the rich variability in the data. When analyzing a speech recording, the factors of variation include the speaker’s age, their sex, their accent and the words that they are speaking. When analyzing an image of a car, the factors of variation include the position of the car, its color, and the angle and brightness of the sun.

A major source of difficulty in many real-world artificial intelligence applications is that many of the factors of variation influence every single piece of data we are able to observe. The individual pixels in an image of a red car might be very close to black at night. The shape of the car’s silhouette depends on the viewing angle. Most applications require us to disentangle the factors of variation and discard the ones that we do not care about.

Of course, it can be very difficult to extract such high-level, abstract features from data. Many of these factors of variation, such as a speaker’s accent, can be identified only using sophisticated, nearly human-level understanding of the data. When it is nearly as difficult to obtain a representation as to solve the original problem, representation learning does not, at first glance, seem to help us.

Deep learning solves this central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations. Deep learning allows the computer to build complex concepts out of simpler concepts. Fig. 1.2 shows how a deep learning system can represent the concept of an image of a person by combining simpler concepts, such as corners and contours, which are in turn defined in terms of edges.

The quintessential example of a deep learning model is the feedforward deep network or multilayer perceptron (MLP). A multilayer perceptron is just a mathematical function mapping some set of input values to output values. The function is formed by composing many simpler functions. We can think of each application of a different mathematical function as providing a new representation of the input.
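
The description of a multilayer perceptron as a composition of simpler functions can be written out directly. The sketch below is a minimal illustration, not a training procedure: the helper names `affine`, `relu`, and `compose` are hypothetical, and the weights are hand-picked (not learned) so that the composed function computes XOR of two binary inputs.

```python
def affine(weights, biases):
    """Return a function computing W x + b for a fixed matrix W and vector b."""
    def layer(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, biases)]
    return layer

def relu(x):
    """Elementwise rectified linear unit."""
    return [max(0.0, v) for v in x]

def compose(*functions):
    """Chain functions left to right, keeping every intermediate representation."""
    def composed(x):
        states = [x]
        for f in functions:
            x = f(x)
            states.append(x)
        return x, states
    return composed

# Hand-picked weights (an assumption for illustration) that implement XOR.
mlp = compose(
    affine([[1, 1], [1, 1]], [0, -1]),  # hidden layer pre-activation
    relu,                               # nonlinearity
    affine([[1, -2]], [0]),             # output layer
)

for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    output, states = mlp(x)
    print(x, output, states)
```

Each entry of `states` is a new representation of the input produced by one more function application, which is the sense in which depth builds later representations out of earlier ones.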

The idea of learning the right representation for the data provides one perspective on deep learning. Another perspective on deep learning is that depth allows the computer to learn a multi-step computer program. Each layer of the representation can be thought of as the state of the computer’s memory after executing another set of instructions in parallel. Networks with greater depth can execute more instructions in sequence. Sequential instructions offer great power because later instructions can refer back to the results of earlier instructions. According to this [...]
