an additional condition must also be imposed on the input bias,

    f(\hat{P}_{in} b) = \hat{P}_{in} x_\partial .   (2.8)
This condition is satisfied for a feedforward neural network, but need not be satisfied for
more general learning systems. After a finite number of steps t the state vector x(t) may
converge to a fixed state x(t) = \bar{x} defined by a fixed point equation

    \bar{x} = f(\hat{w} \bar{x} + b) .   (2.9)

For example, in a deep feedforward neural network with L layers the fixed state would be
reached after L − 1 steps, i.e. x(L − 1) = \bar{x}, given that the condition on the input bias (2.8)
is satisfied. For more general systems the state may or may not converge to a fixed point,
depending on the activation transformation (2.5) and initial conditions (2.4).
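The relaxation to a fixed state can be illustrated with a small numerical sketch. The network, weights, and clamped input value below are made up for illustration and are not taken from the text; the point is only that a feedforward chain whose input bias satisfies (2.8) stops evolving after L − 1 update steps.

```python
import numpy as np

# Illustrative three-neuron chain (all numbers are made up for this sketch):
# neuron 0 = input, neuron 1 = hidden, neuron 2 = output, so L = 3 layers.
f = np.tanh                          # activation map, applied elementwise

w = np.array([[0.0, 0.0, 0.0],       # input neuron receives no signal
              [0.5, 0.0, 0.0],       # hidden neuron is fed by the input
              [0.0, 0.5, 0.0]])      # output neuron is fed by the hidden
b = np.array([np.arctanh(0.7), 0.1, 0.1])  # chosen so f(P_in b) = P_in x_d, cf. (2.8)
P_in = np.diag([1.0, 0.0, 0.0])      # projector onto the input subspace

x = np.array([0.7, 0.0, 0.0])        # initial state with the input clamped to 0.7
for t in range(5):
    x = f(w @ x + b)                 # activation dynamics

# The input neuron keeps its clamped value at every step ...
assert np.isclose(x[0], 0.7)
# ... and the state is a fixed point of x = f(w x + b), reached
# after L - 1 = 2 steps for this feedforward chain.
x_bar = f(w @ x + b)
assert np.allclose(x, x_bar)
```

The input neuron retains its value because its row of the weight matrix vanishes, so its update depends only on the bias, which is exactly what condition (2.8) fixes.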
The final ingredient of a neural septuple is a loss function. In a feedforward neural
network the loss function is usually defined by projecting the fixed state \bar{x} to the output
subspace, \hat{P}_{out} \bar{x} ∈ V_{out}, and then by comparing the result with a desired output state
\hat{P}_{out} x_\partial ∈ V_{out}. For example, one can define a loss function as a squared error of the output neurons,
    H_\partial(\bar{x}, b, \hat{w}) = \frac{1}{2} (\hat{P}_{out} \bar{x} − \hat{P}_{out} x_\partial)^T (\hat{P}_{out} \bar{x} − \hat{P}_{out} x_\partial)   (2.10)
                     = \frac{1}{2} (\bar{x} − x_\partial)^T \hat{P}_{out}^T \hat{P}_{out} (\bar{x} − x_\partial)
                     = \frac{1}{2} (\bar{x} − x_\partial)^T \hat{P}_{out} (\bar{x} − x_\partial) ,

where the last equality uses that \hat{P}_{out} is a projection operator, \hat{P}_{out}^T \hat{P}_{out} = \hat{P}_{out}.
Since there is no error on the input neurons (2.7) we can also rewrite it as a squared error
on all boundary (i.e. input and output) neurons
    H_\partial(\bar{x}, b, \hat{w}) = \frac{1}{2} (\bar{x} − x_\partial)^T (\hat{P}_{in} + \hat{P}_{out}) (\bar{x} − x_\partial) .   (2.11)
For this reason, we shall refer to H_\partial as a boundary loss function.
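The equality of the output-only form (2.10) and the boundary form (2.11) can be checked numerically. The states and projectors below are made-up illustrations: as long as the input components of \bar{x} and x_\partial agree, the two expressions coincide.

```python
import numpy as np

# Made-up states and projectors for a 3-neuron system
# (neuron 0 input, neuron 1 hidden, neuron 2 output).
P_in  = np.diag([1.0, 0.0, 0.0])     # projector onto the input subspace
P_out = np.diag([0.0, 0.0, 1.0])     # projector onto the output subspace

x_bar = np.array([0.7, 0.3, 0.9])    # fixed state
x_d   = np.array([0.7, 0.0, 0.4])    # desired state x_d; the inputs agree, so
                                     # there is no error on the input neurons
e = x_bar - x_d
H_out      = 0.5 * e @ P_out @ e             # eq. (2.10), using P_out^T P_out = P_out
H_boundary = 0.5 * e @ (P_in + P_out) @ e    # eq. (2.11)
assert np.isclose(H_out, H_boundary)         # the two forms agree
```

Note that the hidden-neuron mismatch (the 0.3 component of e) contributes to neither form, since both projectors annihilate the hidden subspace.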
3 Supervised vs. unsupervised
In the previous section we defined a neural network as a neural septuple

    (x, \hat{P}_{in}, \hat{P}_{out}, \hat{w}, b, f, H),

where x is a state vector of all (input, output and hidden) neurons, \hat{P}_{in} x is a state of only
input neurons, \hat{P}_{out} x is a state of only output neurons, \hat{w} is a weight matrix between all pairs
of neurons, b is a bias vector for all neurons, f(y) is an activation map and H(x, b, \hat{w}) is a
loss function. A simple example of a loss function is the boundary loss (2.11), which is known
to work very well in supervised learning. Unfortunately, the boundary loss cannot be used
in unsupervised systems where the output subspace is empty, V_{out} = ∅, and thus the boundary loss is always zero, H = H_\partial = 0.² For this reason, in unsupervised systems (beyond
auto-encoders) we must consider other loss functions which are, perhaps, more general than
the boundary loss.
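The vanishing of the boundary loss in the unsupervised case is immediate to verify with a made-up two-neuron example: with \hat{P}_{out} = 0 and the input clamped to its desired value, every term of (2.11) vanishes.

```python
import numpy as np

# Made-up two-neuron system with no output neurons: V_out is empty,
# so the output projector is the zero matrix.
P_in  = np.diag([1.0, 0.0])          # neuron 0 is the input
P_out = np.zeros((2, 2))             # empty output subspace

x_bar = np.array([0.7, 0.3])         # fixed state (input, hidden)
x_d   = np.array([0.7, 0.0])         # desired state; only its input part is used

e = x_bar - x_d                      # input component vanishes (input is clamped)
H = 0.5 * e @ (P_in + P_out) @ e     # boundary loss, eq. (2.11)
assert H == 0.0                      # identically zero: no signal to learn from
```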
A key observation is that in equation (2.11) the boundary loss was due to a mismatch
in the output conditions or (together with input conditions) in the boundary conditions, i.e.
² In our description an auto-encoder is viewed as a supervised system with periodic boundary conditions,
i.e. the input and output states are set equal to each other.