telecommunications [14], feature selection for facial expression recognition [11], etc. NSGA-II aims to achieve not only a
good diversity of Pareto optimal solutions but also a close approximation of the Pareto optimal front. However, when optimizing
the ontology alignment through NSGA-II, a large number of evaluations are needed to achieve a sufficiently good
approximation of the Pareto front, and each function evaluation for this problem is time- and memory-consuming.
Specifically, in one generation an evaluation function call takes 17 s on average on an Intel Core (TM) i7 at 2.93 GHz with 168 GB
of memory. In our work, in order to reduce the number of these expensive evaluations, a metamodel, i.e. a surrogate evaluation
model built from existing information [15], is introduced to approximate the objective function values using solutions that have
already been evaluated exactly during the NSGA-II tuning process. As the core technology, the metamodeling approach
considerably improves the efficiency of the solving process by replacing a large number of precise evaluations with cheap
approximate ones.
Nowadays, various metamodeling approaches for screening out less promising solutions have been proposed, and the most frequently
employed ones are based on artificial neural networks (ANN) and the Gaussian Random Field Model (GRFM). With respect to ANN
[16], multilayer perceptrons [17] or exactly interpolating radial basis function (RBF) networks [18] are used, either in their standard
forms or with add-on features such as measures of the relative importance of input variables [19]. GRFM likewise
predicts objective function values for new candidate solutions by exploiting information recorded during previous
evaluations. Unlike ANN, however, GRFM provides not only estimates of the function values but also confidence intervals for its
predictions. Recent publications show that GRFM-based metamodels are quite robust and have proved successful in
many applications [20–22]. Therefore, in this paper, we use GRFM to accelerate the search process of NSGA-II.
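The screening idea can be sketched as follows: a minimal Gaussian-process regressor (the simplest form of a GRFM) is trained on solutions that have already been evaluated exactly, and for any new candidate it returns both a predicted objective value and a confidence estimate. This is an illustrative sketch, not the implementation used in this work; the RBF kernel, the hyperparameters and the class name are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential (RBF) covariance between the row vectors of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

class GRFMSurrogate:
    """Minimal Gaussian random field regressor: predicts the objective value
    of a new candidate, plus a confidence estimate, from solutions that
    have already been evaluated exactly."""

    def __init__(self, length_scale=1.0, noise=1e-8):
        self.length_scale, self.noise = length_scale, noise

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        self.y = np.asarray(y, float)
        K = rbf_kernel(self.X, self.X, self.length_scale)
        K += self.noise * np.eye(len(self.X))   # jitter for stability
        self.K_inv = np.linalg.inv(K)
        return self

    def predict(self, Xq):
        Xq = np.asarray(Xq, float)
        Ks = rbf_kernel(Xq, self.X, self.length_scale)
        mean = Ks @ self.K_inv @ self.y
        # Prior variance of the RBF kernel is 1, so the posterior variance
        # is 1 minus the "explained" part; clamp tiny negatives from rounding.
        var = 1.0 - np.sum((Ks @ self.K_inv) * Ks, axis=1)
        return mean, np.sqrt(np.maximum(var, 0.0))
```

A simple screening rule on top of this would be to spend an exact (expensive) evaluation only on candidates whose predicted mean, discounted by a multiple of the predicted standard deviation, is competitive with the current population.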
3. Preliminaries
3.1. Ontology and ontology alignment
There have been many definitions of ontology over the years, but the most frequently referenced one was given by Gruber in 1993,
which defines an ontology as an explicit specification of a conceptualization. For the convenience of the work in this paper, an ontology can be
defined as follows:
An ontology is a 4-tuple O = (C, P, I, A), where:
• C is the set of classes, i.e. the set of concepts that populate the domain of interest,
• P is the set of properties, i.e. the set of relations existing between the concepts of the domain,
• I is the set of individuals, i.e. the set of objects of the real world, representing the instances of a concept,
• A is the set of axioms, i.e. the main building blocks for fixing the semantic interpretation of the concepts and the relations [23].
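Under this definition, an ontology can be represented directly as a 4-tuple. The sketch below uses a Python namedtuple and a toy vehicle ontology; all class, property and individual names are purely illustrative.

```python
from collections import namedtuple

# An ontology O = (C, P, I, A): classes, properties, individuals, axioms.
Ontology = namedtuple("Ontology", ["classes", "properties", "individuals", "axioms"])

# A toy ontology about vehicles (names are illustrative only).
vehicle_onto = Ontology(
    classes={"Vehicle", "Car", "Bicycle"},
    properties={("Car", "subClassOf", "Vehicle"),
                ("Bicycle", "subClassOf", "Vehicle")},
    individuals={"myFordFocus"},
    axioms={("myFordFocus", "instanceOf", "Car")},
)

# Classes, properties and individuals together form the entities of the ontology.
entities = vehicle_onto.classes | vehicle_onto.individuals
```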
In particular, individuals or instances are the basic, “ground level” components of an ontology. The individuals in an ontology
may include concrete objects such as people, animals, tables, automobiles, molecules, and planets, as well as abstract individuals
such as numbers and words [23]. In general, classes, properties and individuals are referred to as entities.
Ontologies are seen as a solution to data heterogeneity on the web. However, the existing ontologies can themselves
introduce heterogeneity: given two ontologies, the same entity can be given different names or simply be defined in different ways,
and the two ontologies may express the same knowledge in different languages [24]. To solve this problem, a so-called ontology
alignment process is necessary. Formally, an alignment between two ontologies can be defined as follows:
An alignment A between two ontologies is a set of mapping elements. A mapping element is a 4-tuple (e, e′, n, r), where:
• e and e′ are the entities of the first and the second ontology, respectively,
• n is a confidence measure in some mathematical structure (typically in the [0, 1] range) holding for the correspondence between the entities e and e′,
• r is a relation (typically the equivalence) holding between the entities e and e′.
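A mapping element and an alignment translate directly into code; the sketch below is a minimal illustration of the definition above, with hypothetical entity names.

```python
from collections import namedtuple

# A mapping element (e, e', n, r): an entity from each ontology, a
# confidence n in the [0, 1] range, and a relation r (typically equivalence "=").
MappingElement = namedtuple("MappingElement", ["e", "e_prime", "n", "r"])

m = MappingElement(e="Car", e_prime="Automobile", n=0.9, r="=")
alignment = {m}  # an alignment A is a set of mapping elements
```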
The ontology alignment process can be defined as follows:
The alignment process can be seen as a function φ which, from a pair of ontologies O and O′ to be aligned, a set of parameters p
and a set of resources r, returns a new alignment A_N between these ontologies:

A_N = φ(O, O′, p, r)

The ontology alignment process computes a mapping element by using a similarity measure, which determines the closeness value n
(related to a given relation r) between the entities e and e′ in the range [0, 1], where 0 stands for complete inequality and 1 for
complete equality.
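As a rough sketch of the function φ, a matcher can pair every two entities whose similarity exceeds a threshold; here the parameter set p is reduced to a single threshold and the resource set r is omitted, which is a deliberate simplification of the general definition, and the similarity measure is a trivial placeholder.

```python
def phi(entities_o, entities_o_prime, params, sim):
    """Sketch of A_N = phi(O, O', p, r): emit a mapping element
    (e, e', n, "=") for each entity pair with similarity n above a threshold."""
    threshold = params.get("threshold", 0.8)
    alignment = set()
    for e in entities_o:
        for e2 in entities_o_prime:
            n = sim(e, e2)           # closeness value n in [0, 1]
            if n >= threshold:
                alignment.add((e, e2, n, "="))
    return alignment

# Placeholder measure: case-insensitive exact match only.
exact = lambda a, b: 1.0 if a.lower() == b.lower() else 0.0
A_N = phi({"Car", "Bike"}, {"car", "Plane"}, {"threshold": 0.8}, exact)
```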
Next, we describe a general classification of the most used similarity measures.
3.2. Similarity measures
Typically, similarity measures between entities of two ontologies can be categorized into syntactic, linguistic, taxonomy-based and
instance-based measures. In the following, we present some common similarity measures belonging to these four categories.
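As one concrete instance of a syntactic measure (a sketch, not the specific measure used later in this paper), the Levenshtein edit distance between two entity names can be normalized into the required [0, 1] range:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance (insert, delete, substitute).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def syntactic_similarity(e1, e2):
    # Map edit distance into [0, 1]: 1 means identical names, 0 fully distinct.
    if not e1 and not e2:
        return 1.0
    return 1.0 - levenshtein(e1.lower(), e2.lower()) / max(len(e1), len(e2))
```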
X. Xue, Y. Wang, Data & Knowledge Engineering 108 (2017) 1–14