Online sequential ELM algorithm with forgetting factor for real applications✩
Haigang Zhang, Sen Zhang∗, Yixin Yin
School of Automation and Electrical Engineering, University of Science and Technology Beijing, No. 30, Xueyuan Road, Haidian District, Beijing 100083, China
✩ This work has been supported by the National Natural Science Foundation of China (NSFC grant nos. 61333002, 61673056, 61673055 and 61671054).
∗ Corresponding author. E-mail address: zhangsen@ustb.edu.cn (S. Zhang).
Article info
Article history:
Received 3 May 2016
Revised 27 August 2016
Accepted 22 September 2016
Available online 24 February 2017
Keywords:
Extreme learning machine
Online learning
Forgetting factor
Sequential learning
Abstract
Sequential learning algorithms are a good choice for learning data one by one or chunk by chunk. Liang et al. proposed the OS-ELM algorithm based on the ordinary ELM algorithm, which produces better generalization performance than other well-known sequential learning algorithms. One deficiency of OS-ELM is that all observations are weighted equally regardless of their acquisition time. However, in many real industrial applications the training data are time-sensitive. In this paper, we propose a modified online sequential learning algorithm with a forgetting factor (named the WOS-ELM algorithm) that assigns larger weights to newer observations. A convergence analysis is then presented to show that the estimate of the output weights converges at an exponential rate as new observations arrive. The value of the forgetting factor adjusts automatically with the forecast error, avoiding excessive human intervention. The simulation study covers several applications, including time-series prediction, time-varying system identification and weather forecasting. The simulation results show that WOS-ELM is more accurate and robust than other sequential learning algorithms.
© 2017 Elsevier B.V. All rights reserved.
1. Introduction
Extreme learning machine (ELM), proposed by Huang in 2006, is a fast machine learning algorithm based on generalized single-hidden-layer feedforward networks (SLFNs) [1]. The key advantage of ELM over other well-known neural network algorithms is that the learning parameters of the neural model are generated randomly, without human tuning or iterative methods [2,3]. The output weights are determined by the method of least squares (LS). ELM has since been widely used in many real applications, including both regression and classification problems [4–7].
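To make this concrete, the following is a minimal sketch of the batch ELM procedure just described, assuming a sigmoid activation; the function and variable names (elm_train, n_hidden, and so on) are ours for illustration, not from the paper.

import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    # Batch ELM: random hidden layer, output weights by least squares.
    # X: (N, d) inputs, T: (N, m) targets.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never tuned
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # hidden-layer output matrix H
    beta = np.linalg.pinv(H) @ T                     # least-squares solution for output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

The single least-squares solve is what makes training fast: only beta is fitted, while W and b stay at their random draws.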
In many real applications, data are obtained one by one or chunk by chunk. Online sequential machine learning is a model of induction that learns one or several instances at a time [8,9]. Liang et al. proposed a fast and accurate online sequential learning algorithm (OS-ELM) for SLFNs based on the ELM network with additive or radial basis function (RBF) hidden nodes [10]. In OS-ELM, newly arriving observations can be learned one by one or chunk by chunk with fixed or varying chunk size, while the output weights are updated analytically at each step.
Since then, many modified OS-ELM algorithms have been proposed, such as EOS-ELM [11], OS-ELMK [12] and OL-ELM-TV [13]. However, these online sequential learning methods do not take the timeliness of the data into consideration. Timeliness problems arise widely in daily life, for example in weather and stock forecasting [14,15]. As time passes, the distribution of the data changes and exhibits non-stationary behavior. In such cases, old data should contribute less and less, so that the model represents the most recent behavior [16]. Broadly speaking, when training an ELM model we should assign high weights to new data and low weights to old data, as sketched below.
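The following is a schematic of this weighting idea, written as a generic forgetting-factor recursive least-squares update of the ELM output weights with a constant factor lam; it is not the paper's WOS-ELM derivation, which adapts the factor to the forecast error. Setting lam = 1 recovers the standard OS-ELM update.

import numpy as np

def ff_step(beta, P, H, T, lam=0.98):
    # One forgetting-factor RLS update on a new data chunk.
    # beta: (L, m) output weights, P: (L, L) covariance-like matrix,
    # H: (n, L) hidden-layer outputs of the chunk, T: (n, m) targets, 0 < lam <= 1.
    S = lam * np.eye(H.shape[0]) + H @ P @ H.T
    K = P @ H.T @ np.linalg.inv(S)       # gain: how much to trust the new chunk
    beta = beta + K @ (T - H @ beta)     # correct weights by the prediction error
    P = (P - K @ H @ P) / lam            # dividing by lam inflates P, so old data fade
    return beta, P

Initialization would follow OS-ELM: on an initial batch with hidden outputs H0 and targets T0, take P = inv(H0.T @ H0) and beta = P @ H0.T @ T0.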
There are many ELM-related online learning algorithms aimed at nonstationary applications. FOS-ELM learns sequential data with timeliness, employing a sliding window to limit the active region during data acquisition [17]. Zhou employed the same forgetting mechanism in the regularized and kernelized ELM algorithms [18]. In addition, Wang proposed the OS-ELMK algorithm and combined it with a sliding window for nonstationary time-series prediction [19]. As new observations arrive, the sliding window moves forward in order to forget the 'old' samples. Another strategy for dealing with nonstationary data is the introduction of a forgetting factor. Viewed in the extreme, the sliding-window method can be seen as a special case of the forgetting-factor method, as illustrated below.
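As a rough illustration of that remark (the names here are ours, not from the paper): with a constant forgetting factor lam, a sample that is a steps old effectively enters the weighted least-squares objective with weight lam**a, whereas a sliding window of width W assigns weight 1 inside the window and 0 outside; both profiles down-weight old samples, the window being the extreme all-or-nothing case.

import numpy as np

lam, W = 0.9, 20
ages = np.arange(100)                 # age of each past sample (0 = newest)
ff_w = lam ** ages                    # forgetting factor: smooth exponential decay
sw_w = (ages < W).astype(float)       # sliding window: hard 0/1 cutoff at width W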
Matias introduced the forgetting factor into the OS-ELM algorithm