telecommunications [14], feature selection for facial expression recognition [11], etc. NSGA-II aims to achieve not only a
good diversity of Pareto optimal solutions but also a close approximation of the Pareto optimal front. However, when optimizing
the ontology alignment through NSGA-II, a large number of evaluations are needed to achieve a sufficiently good
approximation of the Pareto front, and each function evaluation for this problem is time- and memory-consuming.
Specifically, in one generation an evaluation function call takes 17 s on average on an Intel Core (TM) i7 at 2.93 GHz with 168 GB
of memory. In our work, in order to reduce the number of these expensive evaluations, a metamodel, i.e. a surrogate evaluation
model built from existing information [15], is introduced to approximate the objective function values using solutions that have
already been evaluated exactly during the NSGA-II tuning process. As the core technology, the metamodeling approach
considerably improves the efficiency of the solving process by replacing a large number of precise evaluations with cheap
approximate ones.
Nowadays, various metamodeling approaches for screening out less promising solutions have been proposed, and the most frequently
employed ones are based on artificial neural networks (ANN) and the Gaussian Random Field Model (GRFM). With respect to ANN
[16], multilayer perceptrons [17] or exactly interpolating radial basis function (RBF) networks [18] are used, either in their standard
forms or with add-on features such as measures of the relative importance of input variables [19]. GRFM likewise
predicts objective function values for new candidate solutions by exploiting information recorded during previous
evaluations. Unlike ANN, however, GRFM provides not only estimates of the function values but also confidence intervals for its
predictions. Recent publications show that GRFM-based metamodels are quite robust and have proved successful in
many applications [20–22]. Therefore, in this paper, we use GRFM to accelerate the search process of NSGA-II.
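The screening idea can be sketched as follows: a minimal Gaussian-process regressor (the simplest form of a GRFM) is trained on solutions that have already been evaluated exactly, and for any new candidate it returns both a predicted objective value and a confidence estimate. This is an illustrative sketch, not the implementation used in this work; the RBF kernel, the hyperparameters and the class name are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential (RBF) covariance between the row vectors of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

class GRFMSurrogate:
    """Minimal Gaussian random field regressor: predicts the objective value
    of a new candidate, plus a confidence estimate, from solutions that
    have already been evaluated exactly."""

    def __init__(self, length_scale=1.0, noise=1e-8):
        self.length_scale, self.noise = length_scale, noise

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        self.y = np.asarray(y, float)
        K = rbf_kernel(self.X, self.X, self.length_scale)
        K += self.noise * np.eye(len(self.X))   # jitter for stability
        self.K_inv = np.linalg.inv(K)
        return self

    def predict(self, Xq):
        Xq = np.asarray(Xq, float)
        Ks = rbf_kernel(Xq, self.X, self.length_scale)
        mean = Ks @ self.K_inv @ self.y
        # Prior variance of the RBF kernel is 1, so the posterior variance
        # is 1 minus the "explained" part; clamp tiny negatives from rounding.
        var = 1.0 - np.sum((Ks @ self.K_inv) * Ks, axis=1)
        return mean, np.sqrt(np.maximum(var, 0.0))
```

A simple screening rule on top of this would be to spend an exact (expensive) evaluation only on candidates whose predicted mean, discounted by a multiple of the predicted standard deviation, is competitive with the current population.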
3. Preliminaries
3.1. Ontology and ontology alignment
There have been many definitions of ontology over the years, but the most frequently referenced one was given by Gruber in 1993,
which defines an ontology as an explicit specification of a conceptualization. For the convenience of the work in this paper, an ontology can be
defined as follows:
An ontology is a 4-tuple O = (C, P, I, A), where:
• C is the set of classes, i.e. the set of concepts that populate the domain of interest,
• P is the set of properties, i.e. the set of relations existing between the concepts of the domain,
• I is the set of individuals, i.e. the set of objects of the real world, representing the instances of a concept,
• A is the set of axioms, i.e. the main building blocks for fixing the semantic interpretation of the concepts and the relations [23].
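Under this definition, an ontology can be represented directly as a 4-tuple. The sketch below uses a Python namedtuple and a toy vehicle ontology; all class, property and individual names are purely illustrative.

```python
from collections import namedtuple

# An ontology O = (C, P, I, A): classes, properties, individuals, axioms.
Ontology = namedtuple("Ontology", ["classes", "properties", "individuals", "axioms"])

# A toy ontology about vehicles (names are illustrative only).
vehicle_onto = Ontology(
    classes={"Vehicle", "Car", "Bicycle"},
    properties={("Car", "subClassOf", "Vehicle"),
                ("Bicycle", "subClassOf", "Vehicle")},
    individuals={"myFordFocus"},
    axioms={("myFordFocus", "instanceOf", "Car")},
)

# Classes, properties and individuals together form the entities of the ontology.
entities = vehicle_onto.classes | vehicle_onto.individuals
```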
In particular, individuals or instances are the basic, “ground level” components of an ontology. The individuals in an ontology
may include concrete objects such as people, animals, tables, automobiles, molecules, and planets, as well as abstract individuals
such as numbers and words [23]. In general, classes, properties and individuals are referred to as entities.
Ontologies are seen as a solution to data heterogeneity on the web. However, the existing ontologies can themselves
introduce heterogeneity: given two ontologies, the same entity can be given different names or simply be defined in different ways,
and the two ontologies may express the same knowledge in different languages [24]. To solve this problem, a so-called ontology
alignment process is necessary. Formally, an alignment between two ontologies can be defined as follows:
An alignment A between two ontologies is a set of mapping elements. A mapping element is a 4-tuple (e, e′, n, r), where:
• e and e′ are the entities of the first and the second ontology, respectively,
• n is a confidence measure in some mathematical structure (typically in the [0, 1] range) holding for the correspondence between the entities e and e′,
• r is a relation (typically the equivalence) holding between the entities e and e′.
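A mapping element and an alignment translate directly into code; the sketch below is a minimal illustration of the definition above, with hypothetical entity names.

```python
from collections import namedtuple

# A mapping element (e, e', n, r): an entity from each ontology, a
# confidence n in the [0, 1] range, and a relation r (typically equivalence "=").
MappingElement = namedtuple("MappingElement", ["e", "e_prime", "n", "r"])

m = MappingElement(e="Car", e_prime="Automobile", n=0.9, r="=")
alignment = {m}  # an alignment A is a set of mapping elements
```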
The ontology alignment process can be defined as follows:
The alignment process can be seen as a function φ which, from a pair of ontologies O and O′ to be aligned, a set of parameters p
and a set of resources r, returns a new alignment A_N between these ontologies:

A_N = φ(O, O′, p, r)

The ontology alignment process computes a mapping element by using a similarity measure, which determines the closeness value n
(related to a given relation r) between the entities e and e′ in the range [0, 1], where 0 stands for complete inequality and 1 for
complete equality.
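As a rough sketch of the function φ, a matcher can pair every two entities whose similarity exceeds a threshold; here the parameter set p is reduced to a single threshold and the resource set r is omitted, which is a deliberate simplification of the general definition, and the similarity measure is a trivial placeholder.

```python
def phi(entities_o, entities_o_prime, params, sim):
    """Sketch of A_N = phi(O, O', p, r): emit a mapping element
    (e, e', n, "=") for each entity pair with similarity n above a threshold."""
    threshold = params.get("threshold", 0.8)
    alignment = set()
    for e in entities_o:
        for e2 in entities_o_prime:
            n = sim(e, e2)           # closeness value n in [0, 1]
            if n >= threshold:
                alignment.add((e, e2, n, "="))
    return alignment

# Placeholder measure: case-insensitive exact match only.
exact = lambda a, b: 1.0 if a.lower() == b.lower() else 0.0
A_N = phi({"Car", "Bike"}, {"car", "Plane"}, {"threshold": 0.8}, exact)
```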
Next, we describe a general classification of the most used similarity measures.
3.2. Similarity measures
Typically, similarity measures between entities of two ontologies can be categorized into syntactic, linguistic, taxonomy-based and
instance-based measures. In the following, we present some common similarity measures belonging to these four categories.
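As one concrete instance of a syntactic measure (a sketch, not the specific measure used later in this paper), the Levenshtein edit distance between two entity names can be normalized into the required [0, 1] range:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance (insert, delete, substitute).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def syntactic_similarity(e1, e2):
    # Map edit distance into [0, 1]: 1 means identical names, 0 fully distinct.
    if not e1 and not e2:
        return 1.0
    return 1.0 - levenshtein(e1.lower(), e2.lower()) / max(len(e1), len(e2))
```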
X. Xue, Y. Wang, Data & Knowledge Engineering 108 (2017) 1–14