the Self-Organizing Map (SOM)16 has been used to deal with this problem, as its topology-preserving property is favorable for VQ, providing a mapping with enhanced tolerance to faults and noise. SOM is trained in an unsupervised manner to produce a low-dimensional space, and it can provide a topologically preserved mapping from the input space to the output space using a neighborhood-based function between neurons.17–20 However, the distributions of data points are often distorted on the map.
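As a rough illustration of the neighborhood-based update mentioned above, the following is a minimal sketch of one SOM training step in Python (NumPy). The Gaussian neighborhood, the learning rate lr, the width sigma, and the 2-D grid layout are illustrative assumptions, not settings taken from the cited works.

```python
import numpy as np

def som_step(weights, grid, x, lr=0.1, sigma=1.0):
    """One SOM update: move every neuron toward x, scaled by a Gaussian
    neighborhood centred on the best-matching unit (BMU) on the map grid."""
    # weights: (n_neurons, dim) codebook vectors; grid: (n_neurons, 2) map coordinates
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))   # winner in input space
    grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)   # squared distance on the map
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))             # neighborhood function
    weights += lr * h[:, None] * (x - weights)             # topology-preserving update
    return weights
```

Iterating this step over the input data while shrinking sigma and lr yields the topology-preserving mapping described above.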
For better scaling and visualization, a direct and faithful display of the data structure and data distribution is highly desirable. An extended SOM called the "Visualization-induced SOM" (ViSOM)21 has been proposed, which constrains the lateral contraction force between the neurons in SOM so as to regularize the inter-neuron distances according to a scalable parameter that determines and controls the resolution of the map. ViSOM can thus preserve the data structure and the topology as faithfully as possible. However, neither SOM nor ViSOM takes the classification border into account. To design more efficient and robust learning algorithms for classification, various incremental learning algorithms have been proposed and applied to classification and clustering tasks.
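To make the regularized inter-neuron distances concrete, the sketch below follows the commonly stated ViSOM-style neighbor update, in which each neighbor of the winner feels an extra lateral force that drives its input-space distance to the winner toward lam times their distance on the map grid. The function name, lr, sigma, and the unit lateral-force weight are assumptions for illustration and may differ in detail from the cited ViSOM formulation.

```python
import numpy as np

def visom_step(weights, grid, x, lr=0.1, sigma=1.0, lam=1.0):
    """ViSOM-style update: the winner moves toward x as in SOM; every
    neighbor additionally feels a lateral force that regularizes its
    distance to the winner, with lam controlling the map resolution."""
    v = np.argmin(np.linalg.norm(weights - x, axis=1))     # winning neuron
    grid_d = np.linalg.norm(grid - grid[v], axis=1)        # distances to the winner on the map
    h = np.exp(-grid_d ** 2 / (2 * sigma ** 2))            # neighborhood function
    d = np.linalg.norm(weights - weights[v], axis=1)       # distances to the winner in input space
    denom = lam * np.where(grid_d > 0, grid_d, 1.0)        # avoid dividing by zero at the winner
    force = np.where(grid_d > 0, d / denom - 1.0, 0.0)     # contraction/expansion factor
    update = (x - weights[v]) + (weights[v] - weights) * force[:, None]
    weights += lr * h[:, None] * update
    return weights
```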
Well-known examples of incremental or competitive neural networks include the "Growing Cell Structures" (GCS)22 and the "Growing Neural Gas" (GNG).23 It is known that the requirement of a predetermined neuron number and a pre-fixed structure makes SOM impractical. GCS was developed to allow projections onto a nonlinear, discretely sampled subspace whose dimensionality has to be chosen a priori. However, some information on the topological characteristics of the input data may be lost in this process, because only the relationship between the inherent data and the target space is considered. GNG is a further modification of GCS, in which the dimensionality of the topological structure is not pre-defined but discovered during the training process. Hamker24 proposed an extension of GNG to perform supervised incremental learning tasks, but it is not suitable for unsupervised learning. In recent years, further variants of GNG have been proposed to perform unsupervised clustering tasks, such as SOINN.8
SOINN works as an incremental neural learning technique to process online dynamic data, and it represents the topological structure of the input distribution. Empirical results show that SOINN is capable of learning the necessary number of nodes, using fewer nodes yet obtaining better results than GNG. Furthermore, several extensions of SOINN have been developed and have achieved improved performance on many databases, such as adjusted SOINN25 and ESOINN.26 They have been used in the incremental acquisition of language for humanoid robots,27 associative memory,28 finding typical vectors for k-nearest neighbor classification,29 and pattern-based incremental reasoning.30
However, four problems remain unsolved with SOINN and its extensions: (1) SOINN is not stable: the learning results depend heavily on the order of the input data, that is, the number and positions of the nodes will differ even if one repeats the training under the same conditions and changes only the order of the input; (2) SOINN uses a two-layer network, and during online learning the user must explicitly specify the time points at which to stop the first layer's learning and begin the second layer's; (3) SOINN preserves only the topological structure of the input data on the output space, not the inter-neuron distances, and thus fails to preserve the metric of the mapped space; (4) SOINN incrementally updates only the nearest node, ignoring the information carried by other representative nodes such as the second-nearest and the farthest node, and thereby losing important information during learning.
Moreover, most of the above methods work offline. Nowadays, online learning has proven better able to fit large or slowly varying datasets because it learns one instance at a time, and it has therefore been widely used in image classification tasks owing to its high learning efficiency. Examples are as follows. Mairal et al.31 proposed an iterative online codebook learning algorithm that minimizes the expected cost instead of the empirical cost for a training set of effectively infinite size. Based on first-order stochastic gradient descent, the algorithm scaled gracefully to large-scale datasets and was suitable for a wide range of learning problems. To capture salient properties of images in real time, Zhang et al.32 proposed an online sparse learning algorithm that utilizes the reconstruction error to update the current codebook. The method greatly sped up the computation while saving storage space and memory.
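To illustrate the online codebook-update idea shared by these approaches, the sketch below processes one descriptor at a time and takes a stochastic-gradient step that reduces its reconstruction error. The hard nearest-word assignment and the fixed learning rate are deliberate simplifications of the sparse coding and step-size schedules used in the cited works, not their actual algorithms.

```python
import numpy as np

def online_codebook_step(codebook, x, lr=0.05):
    """One online codebook update: assign the incoming descriptor to its
    nearest visual word, then take a stochastic-gradient step on the
    squared reconstruction error 0.5 * ||x - c_k||^2 for that word."""
    k = np.argmin(np.linalg.norm(codebook - x, axis=1))   # nearest visual word
    residual = x - codebook[k]                            # reconstruction error
    codebook[k] += lr * residual                          # gradient step on the chosen word
    return codebook
```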
In the most recent research,33 the codebook has been represented as a graph, in which the nodes are visual words and the edges describe the relationships