$\{(\mathbf{a}_i, b_i)\}_{i=1}^{L}$ randomly generated from any intervals of $\mathbb{R}^d \times \mathbb{R}$, according to any continuous probability distribution, with probability one, $\|\mathbf{H}_{N \times L} \boldsymbol{\beta}_{L \times m} - \mathbf{T}_{N \times m}\| < \epsilon$.
From the interpolation point of view, the maximum number of hidden nodes required is not larger than the number of training samples. In fact, if $L = N$, the training error can be exactly zero, as the sketch below demonstrates.
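To make the interpolation claim concrete, here is a minimal NumPy sketch (not the authors' code; the sigmoid additive nodes and uniformly drawn data are illustrative assumptions). With $L = N$ random hidden nodes, the hidden layer output matrix $\mathbf{H}$ is square and, with probability one, invertible, so the training error vanishes up to floating-point rounding:

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, m = 50, 3, 2                       # samples, input dim, output dim
X = rng.uniform(-1.0, 1.0, (N, d))       # training inputs x_i
T = rng.uniform(-1.0, 1.0, (N, m))       # training targets t_i

L = N                                    # as many hidden nodes as samples
A = rng.uniform(-1.0, 1.0, (d, L))       # random input weights a_i
b = rng.uniform(-1.0, 1.0, L)            # random biases b_i

H = 1.0 / (1.0 + np.exp(-(X @ A + b)))   # hidden layer output matrix (N x L)
beta = np.linalg.solve(H, T)             # H is square and invertible w.p. 1

print(np.linalg.norm(H @ beta - T))      # ~ 0 up to rounding error
```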
Theorem 2.2 [6] Given any activation function $g: \mathbb{R} \to \mathbb{R}$ which is infinitely differentiable in any interval and $N$ arbitrary distinct samples $(\mathbf{x}_i, \mathbf{t}_i) \in \mathbb{R}^d \times \mathbb{R}^m$, for any $\{(\mathbf{a}_i, b_i)\}_{i=1}^{N}$ randomly generated from any intervals of $\mathbb{R}^d \times \mathbb{R}$, according to any continuous probability distribution, with probability one, $\|\mathbf{H}_{N \times N} \boldsymbol{\beta}_{N \times m} - \mathbf{T}_{N \times m}\| = 0$.
From the interpolation point of view, a wide range of activation functions can be used in ELM, including the sigmoid functions, the radial basis, sine, cosine, exponential, and many other non-regular functions [13]. It may be too strict to require that the activation functions of the hidden nodes be infinitely differentiable. For example, this requirement excludes some important activation functions such as the threshold function $g(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$. Threshold networks are very popular in real applications, especially in digital hardware implementations. However, as the threshold function is not differentiable, researchers did not manage to find any efficient direct learning algorithms for threshold networks in the past two decades [27–30]. Interestingly, from the universal approximation point of view, the above-mentioned interpolation theorem can be extended to almost any type of nonlinear piecewise continuous function, including the threshold function; thus an efficient direct learning algorithm (e.g., ELM, as sketched below) can be applied to those cases which could not be handled by other learning techniques in past decades.
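Because ELM computes the output weights analytically rather than by gradient descent, the non-differentiability of the threshold function causes no difficulty. Below is a sketch of a plain pseudoinverse-based ELM with threshold activation (the data, network size, and helper names are assumptions for illustration, not the paper's experiment):

```python
import numpy as np

rng = np.random.default_rng(1)

def threshold(z):
    """g(x) = 1 if x >= 0 else 0 (non-differentiable)."""
    return (z >= 0.0).astype(float)

def elm_fit(X, T, L, g):
    """Random hidden layer; output weights solved as beta = pinv(H) T.
    No derivative of g is needed anywhere, so g may be discontinuous."""
    d = X.shape[1]
    A = rng.uniform(-1.0, 1.0, (d, L))   # random input weights a_i
    b = rng.uniform(-1.0, 1.0, L)        # random biases b_i
    H = g(X @ A + b)                     # hidden layer output matrix
    beta = np.linalg.pinv(H) @ T         # least-squares output weights
    return A, b, beta

X = rng.uniform(-1.0, 1.0, (200, 2))
T = np.sin(3 * X[:, :1]) + X[:, 1:]      # a continuous toy target
A, b, beta = elm_fit(X, T, L=100, g=threshold)
rmse = np.sqrt(np.mean((threshold(X @ A + b) @ beta - T) ** 2))
print("training RMSE:", rmse)
```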
2.2 Universal approximation theorem
Huang et al. [7] proved in theory that SLFNs with randomly generated additive or RBF nodes can universally approximate any continuous target function over any compact subset $X$ of $\mathbb{R}^d$. Let $L^2(X)$ be the space of functions $f$ on a compact subset $X$ of the $d$-dimensional Euclidean space $\mathbb{R}^d$ such that $|f|^2$ is integrable, that is, $\int_X |f(\mathbf{x})|^2 \, d\mathbf{x} < \infty$. Let $L^2(\mathbb{R}^d)$ be denoted by $L^2$. For $u, v \in L^2(X)$, the inner product $\langle u, v \rangle$ is defined by

$$\langle u, v \rangle = \int_X u(\mathbf{x}) \, v(\mathbf{x}) \, d\mathbf{x} \qquad (12)$$
The norm in the $L^2(X)$ space is denoted by $\|\cdot\|$, and the closeness between the network function $f_L$ and the target function $f$ is measured by the $L^2(X)$ distance:

$$\| f_L - f \| = \left[ \int_X | f_L(\mathbf{x}) - f(\mathbf{x}) |^2 \, d\mathbf{x} \right]^{1/2} \qquad (13)$$
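In practice the distance (13) can be estimated numerically. The following sketch (illustrative assumptions: $X = [0, 1]^d$, so $\mathrm{vol}(X) = 1$, and plain Monte Carlo integration with uniform samples) approximates the $L^2(X)$ distance between a candidate network function and a target:

```python
import numpy as np

rng = np.random.default_rng(2)

def l2_distance(f_L, f, d, n_samples=100_000):
    """Monte Carlo estimate of [ integral_X |f_L(x) - f(x)|^2 dx ]^(1/2)
    over X = [0, 1]^d, whose volume is 1."""
    x = rng.uniform(0.0, 1.0, (n_samples, d))
    return np.sqrt(np.mean((f_L(x) - f(x)) ** 2))

f   = lambda x: np.sin(2 * np.pi * x[:, 0])   # target function
f_L = lambda x: 2 * np.pi * x[:, 0] - np.pi   # crude stand-in "network"
print("||f_L - f|| ~", l2_distance(f_L, f, d=1))
```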
Definition 2.1 (p. 334 of [31]) A function $g(x): \mathbb{R} \to \mathbb{R}$ is said to be piecewise continuous if it has only a finite number of discontinuities in any interval, and its left and right limits are defined (not necessarily equal) at each discontinuity.
Definition 2.2 A node is called a random node if its parameters $(\mathbf{a}, b)$ are randomly generated based on a continuous sampling probability distribution.
Different from the randomness mentioned in other learning methods [4, 32, 33], all the hidden node parameters $(\mathbf{a}_i, b_i)$ in ELMs can be independent of the training samples and can be randomly generated before the training samples are observed. (Refer to [34] for the details of the differences between ELM and the methods of Igelnik and Pao [33] and Lowe et al. [4, 32].)
Definition 2.3 The function sequence $\{g_i = G(\mathbf{a}_i, b_i, \mathbf{x})\}$ is said to be randomly generated if the corresponding parameters $(\mathbf{a}_i, b_i)$ are randomly generated from $\mathbb{R}^d \times \mathbb{R}$ or $\mathbb{R}^d \times \mathbb{R}^{+}$ based on a continuous sampling probability distribution.
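As an illustration of Definition 2.3, the sketch below draws $(\mathbf{a}_i, b_i)$ from continuous distributions and evaluates the resulting function sequence for both node types (the sigmoid and Gaussian forms of $G$ and the parameter ranges are assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def additive_node(a, b, x):
    """G(a, b, x) = g(a . x + b) with a sigmoid g."""
    return 1.0 / (1.0 + np.exp(-(x @ a + b)))

def rbf_node(a, b, x):
    """G(a, b, x) = k((x - a) / b) with a Gaussian k; b drawn from R+."""
    return np.exp(-np.sum((x - a) ** 2, axis=-1) / b**2)

d, L = 2, 5
A = rng.normal(size=(L, d))              # a_i drawn from R^d
b_add = rng.normal(size=L)               # b_i drawn from R   (additive)
b_rbf = rng.uniform(0.1, 2.0, size=L)    # b_i drawn from R+  (RBF)

x = rng.uniform(-1.0, 1.0, (10, d))      # evaluation points
G_add = np.stack([additive_node(A[i], b_add[i], x) for i in range(L)], axis=1)
G_rbf = np.stack([rbf_node(A[i], b_rbf[i], x) for i in range(L)], axis=1)
print(G_add.shape, G_rbf.shape)          # (10, 5) each: column i is g_i(x)
```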
Lemma 2.1 (Proposition 1 of [16]) Given $g: \mathbb{R} \to \mathbb{R}$, $\operatorname{span}\{g(\mathbf{a} \cdot \mathbf{x} + b) : (\mathbf{a}, b) \in \mathbb{R}^d \times \mathbb{R}\}$ is dense in $L^p$ for every $p \in [1, \infty)$ if and only if $g$ is not a polynomial (almost everywhere).
Lemma 2.2 [17] Let $k: \mathbb{R}^d \to \mathbb{R}$ be an integrable bounded function such that $k$ is continuous (almost everywhere) and $\int_{\mathbb{R}^d} k(\mathbf{x}) \, d\mathbf{x} \neq 0$. Then $\operatorname{span}\{k(\frac{\mathbf{x} - \mathbf{a}}{b}) : (\mathbf{a}, b) \in \mathbb{R}^d \times \mathbb{R}^{+}\}$ is dense in $L^p$ for every $p \in [1, \infty)$.
Lemmas 2.1 and 2.2 show that feedforward neural networks with additive or RBF hidden nodes can approximate any continuous target function provided that the hidden node parameters $(\mathbf{a}_i, b_i)$ are tuned properly. However, these lemmas only establish the universal approximation capability of such networks; how to find suitable hidden node parameters $(\mathbf{a}_i, b_i)$ remains open, and many tuning-based learning algorithms have been suggested in the past. Huang et al. [7] proved that, given any bounded nonconstant piecewise continuous activation function $g: \mathbb{R} \to \mathbb{R}$ for additive nodes, or any integrable piecewise continuous activation function $g: \mathbb{R} \to \mathbb{R}$ (with $\int_{\mathbb{R}} g(x) \, dx \neq 0$) for RBF nodes, the hidden layer of such an SLFN need not be tuned; in fact, all the hidden nodes can be randomly generated. SLFNs with randomly generated hidden nodes can universally approximate any continuous target function, as illustrated by the sketch below.
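The sketch referenced above is a toy experiment (not from [7]): with randomly generated Gaussian RBF nodes and analytically computed output weights, the training error typically shrinks as the number of hidden nodes $L$ grows, in line with the universal approximation result:

```python
import numpy as np

rng = np.random.default_rng(4)

X = rng.uniform(-1.0, 1.0, (500, 1))     # inputs on a compact set
T = np.sinc(3 * X)                       # a continuous target f

for L in (5, 20, 80, 320):
    A = rng.uniform(-1.0, 1.0, (L, 1))   # random centers a_i
    b = rng.uniform(0.1, 1.0, L)         # random impact widths b_i > 0
    H = np.exp(-((X - A.T) ** 2) / b**2) # RBF hidden layer (N x L)
    beta = np.linalg.pinv(H) @ T         # analytic output weights
    rmse = np.sqrt(np.mean((H @ beta - T) ** 2))
    print(f"L = {L:4d}  training RMSE = {rmse:.4f}")
```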
Let $e_L = f - f_L$ denote the residual error function