Optimizing Data Grid Replica Creation Strategies with a Quantum Evolution Algorithm


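This paper examines replica creation strategies in data grids (Data Grid), specifically through the application of the Quantum Evolution Algorithm (QEA). A data grid manages large-scale distributed data, and replica management is essential for speeding up data access, ensuring data availability, and reducing network bandwidth demand. Such problems have traditionally been tackled with computational intelligence methods such as the Genetic Algorithm (GA), Ant Colony Optimization (ACO), or Particle Swarm Optimization (PSO). The authors instead propose a global replica creation strategy based on QEA.

The authors first analyze existing replica creation strategies and then, in light of the characteristics and challenges of data grids, divide the optimization model into two parts: creating replicas of a single data item and creating replicas of multiple data items. The key technical questions are how to represent replica creation decisions, how to evaluate a strategy, and how to set appropriate constraints. The advantage of QEA here is that it looks for the optimal solution through parallel search and adaptive learning. The authors describe in detail how QEA is applied in the design of the replica creation algorithm, covering the choice of adaptive operators, the population update mechanism, and the design of the fitness function.

In a series of comparative experiments run with the OptorSim tool, the QEA-based replica creation strategy outperformed GA, ACO and PSO in reducing job response time and network bandwidth consumption, and its advantage became more pronounced as the workload increased. Nonparametric statistical tests were used to analyze the experimental results and verify that the effect of QEA is significant. The results further confirm the effectiveness and adaptability of QEA in complex, dynamic data grid environments. The paper offers a novel and efficient solution for data grid replica management and demonstrates the potential of quantum evolutionary algorithms for large-scale distributed computing problems. Through its comparison with traditional optimization algorithms, it shows the advantage of QEA in improving data grid performance, which has both theoretical and practical value for the development of data grid technology.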


Compared with other fields, CIA algorithms have rarely been applied in DGRM [14,27–29], especially in replica creation [14]. Among the above algorithms, QEA [30–32] is a novel CIA algorithm developed in recent years that combines the advantages of evolutionary and quantum computing. By adopting a qubit chromosome as its representation, QEA can represent a linear superposition of solutions owing to its probabilistic nature. Compared with other CIA algorithms such as GA and SA, QEA converges rapidly and has good global search capability, and it has been a research hot spot in recent years [31–39]. However, no literature has yet addressed the application of QEA to replica creation in data grids.
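To make the qubit chromosome idea concrete, the following minimal Python sketch may help. It is not the algorithm of this paper: it assumes the common QEA convention in which each qubit has amplitudes (alpha, beta) with alpha^2 + beta^2 = 1, and it stores each qubit as an angle theta so that alpha = cos(theta) and beta = sin(theta). The function names, the rotation step `delta` and the clamping margin `eps` are hypothetical choices made only for illustration.

```python
import math
import random


def init_chromosome(m):
    """Return m qubits as angles; pi/4 gives alpha^2 = beta^2 = 0.5 for every qubit."""
    return [math.pi / 4] * m


def amplitudes(theta):
    """Amplitudes (alpha, beta) of one qubit; alpha^2 + beta^2 = 1 by construction."""
    return math.cos(theta), math.sin(theta)


def observe(chromosome, rng):
    """Collapse the chromosome to a classical bit string.

    Bit i becomes 1 with probability beta_i^2 = sin(theta_i)^2, so a single
    chromosome stands for a probability distribution over all 2^m bit strings.
    """
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in chromosome]


def rotate_toward(chromosome, target_bits, delta=0.05 * math.pi):
    """Simplified rotation update (an assumption, not a full rotation-gate table).

    Each qubit is rotated by delta toward the matching bit of target_bits,
    e.g. the best solution found so far. Angles are clamped slightly inside
    [0, pi/2] so that no bit becomes completely deterministic.
    """
    eps = 0.01 * math.pi
    updated = []
    for theta, bit in zip(chromosome, target_bits):
        theta = theta + delta if bit == 1 else theta - delta
        updated.append(min(max(theta, eps), math.pi / 2 - eps))
    return updated
```

Repeatedly observing the chromosome, evaluating the resulting bit strings and rotating toward the best one gradually concentrates the probability mass on good solutions, which is the intuition behind the superposition-based representation described above.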

Building on QEA, we propose a method for applying it to data grid replica creation. We give the expression of the optimization model and propose the replica creation procedure.

The remainder of this paper is organized as follows. Section 2 reviews the related work. Section 3 introduces QEA. In Section 4, a QEA-based replica creation strategy is proposed and discussed in detail. In Section 5, simulation experiments are carried out to demonstrate the performance of the proposed strategies. Conclusions and future work are presented in Section 6.

2. Related works

2.1. Principles of DGRM

A data grid mainly consists of grid nodes and network links and can be described as a 2-tuple (V, E), where V is a node set and E is a link set. By extending this concept, DGRM can be abstracted as a 4-tuple (V, E, R, O), where V and E retain the same meanings, R represents a replica set, and O denotes an operation set. It must be emphasized that master data are also treated as replicas. These four elements of DGRM can be further defined as follows.

V can be abstracted as a 4-tuple whose elements are, for a given node, the set of its computing elements, the set of its storage elements, the set of jobs assigned to it, and the set of replicas residing on it.

E can be denoted by a 3-tuple $(V_i, V_j, C_{i,j})$, where $V_i$ and $V_j$ are the two endpoints of an edge and $C_{i,j}$ represents the transfer cost per unit of data between them, which is affected by network bandwidth, disk throughput and so on.

R can be represented by a 3-tuple (logName, phyName, size), where logName and phyName denote the DLN and DPN, respectively, and size is the amount of data.

O can be described by a 5-tuple $(O_{creation}, O_{location}, O_{selection}, O_{deletion}, O_{consistency})$, which defines the functions of DGRM in the order of replica creation, location, selection, deletion and consistency.
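As a rough illustration, the four elements (V, E, R, O) can be written down as plain data types. The Python sketch below is only one possible encoding: the class and field names, the use of string identifiers, and the treatment of links as undirected are assumptions made for this example and do not come from the paper.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Replica:
    """Element of R: the 3-tuple (logName, phyName, size)."""
    log_name: str   # logical name (DLN)
    phy_name: str   # physical name (DPN)
    size: float     # amount of data, in arbitrary units


@dataclass
class Node:
    """Element of V: computing elements, storage elements, jobs and replicas."""
    computing_elements: Set[str] = field(default_factory=set)
    storage_elements: Set[str] = field(default_factory=set)
    jobs: Set[str] = field(default_factory=set)
    replicas: Set[Replica] = field(default_factory=set)


class Operation(Enum):
    """Element of O, listed in the order given in the text."""
    CREATION = 1
    LOCATION = 2
    SELECTION = 3
    DELETION = 4
    CONSISTENCY = 5


@dataclass
class DataGrid:
    nodes: Dict[str, Node]               # V, keyed by a hypothetical node id
    links: Dict[Tuple[str, str], float]  # E: (i, j) -> C_ij, cost per unit of data

    def unit_cost(self, i: str, j: str) -> float:
        """C_ij, assuming undirected links and zero cost within a node."""
        if i == j:
            return 0.0
        return self.links.get((i, j), self.links.get((j, i), float("inf")))

    def transfer_cost(self, replica: Replica, src: str, dst: str) -> float:
        """Cost of moving one replica: per-unit cost times its size."""
        return self.unit_cost(src, dst) * replica.size

    def replica_set(self) -> Set[Replica]:
        """R: every replica held anywhere in the grid, master data included."""
        return {r for node in self.nodes.values() for r in node.replicas}
```

For instance, a two-node grid with a single 100-unit replica and a link cost of 0.5 per unit gives a transfer cost of 50 between the two nodes.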

The objectives of DGRM are to manipulate the replicas in R through the operations defined in O so as to meet the data access requirements of the jobs assigned to the grid nodes in V, while reducing network traffic over E, shortening job execution time, improving data availability and increasing resource utilization. The practical operation process of DGRM is depicted in Fig. 1.

First, the job scheduling module assigns jobs to the nodes in V, and each node then determines which replicas in R its assigned jobs require. Second, the node checks whether those replicas are stored locally. If they are, the jobs are processed locally; otherwise the operations in O are used to obtain the best replicas. If necessary, new replicas are created by $O_{creation}$. Meanwhile, $O_{consistency}$ is called to ensure that all replicas of each data item are identical, unless the data are read-only. $O_{location}$ maps a DLN to a DPN, whereas $O_{deletion}$ removes existing replicas when the available storage space of the node is inadequate for new replicas.
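The process just described can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure of Fig. 1: the function name and its parameters are hypothetical, the "best" replica is taken to be the one with the lowest per-unit transfer cost, a new local replica is always attempted after a remote fetch, and the deletion policy (evict the oldest local replicas that still have a copy elsewhere) is a placeholder, since the text does not specify one. Location and consistency handling are left out.

```python
def serve_request(node, log_name, holdings, capacity, unit_cost):
    """Serve one data request issued by a job running on `node`.

    holdings:  dict node -> dict {logical name: size} of replicas stored there
    capacity:  dict node -> total storage space
    unit_cost: function (i, j) -> transfer cost per unit of data
    Returns (source node, transfer cost).
    """
    local = holdings[node]
    if log_name in local:
        # Local hit: the job is processed without any transfer.
        return node, 0.0

    # Selection: pick the remote replica that is cheapest to transfer.
    sources = [n for n, files in holdings.items() if log_name in files]
    if not sources:
        raise LookupError(f"no replica of {log_name!r} in the grid")
    best = min(sources, key=lambda n: unit_cost(n, node))
    size = holdings[best][log_name]
    cost = unit_cost(best, node) * size

    def used():
        return sum(local.values())

    # Deletion: free space if needed, oldest local replica first, and never
    # remove a file that has no other copy in the grid (placeholder policy).
    for victim in list(local):
        if used() + size <= capacity[node]:
            break
        if any(victim in files for n, files in holdings.items() if n != node):
            del local[victim]

    # Creation: store a new local replica if it now fits.
    if used() + size <= capacity[node]:
        local[log_name] = size
    return best, cost
```

A replica creation strategy, in the sense used in the rest of this section, decides when and where the creation step is worth performing instead of applying it unconditionally as this sketch does.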

2.2. Replica creation strategies

Generally, replica creation strategies are of two types: static and dynamic. Static strategies need to obtain relevant information in advance and place replicas before jobs are executed, whereas dynamic strategies replicate data according to information collected at job execution time. Compared with static strategies, dynamic strategies can assign replicas flexibly during job execution and adapt well to different environments. However, they can also prolong the job response time, since replica creation can increase the job waiting time. Both types are involved in data grid replica creation, and existing strategies often use them together. For this reason, the later discussion does not distinguish between static and dynamic strategies. Instead, this paper classifies the existing replica creation strategies into traditional and CIA-based strategies according to the optimization techniques they use.

2.2.1. Traditional replica creation strategies

Many replica creation strategies were proposed in the early stages of DGRM development.

As early as 2001, Ranganathan and Foster [3] proposed six replication strategies. Later, in 2003, Bell et al. [4] introduced economic principles into data grid replica creation and put forward a novel replica creation strategy based on an Economic Model (EM) to reduce job execution time. Meanwhile, some EM-based replica creation strategies were implemented in the OptorSim [5,6] simulator.

In 2006, Rahman et al. [7] proposed a static replica placement algorithm that placed replicas on nodes by optimizing the average response time, and a dynamic replica maintenance algorithm that reallocated replicas to new nodes if performance degraded over the last k time periods. Tang et al. [8] put forward two replica creation strategies, Centralized Dynamic Strategy (CDR) and Distributed Dynamic Strategy (DDR), which can minimize data access time and network load in combination with Shortest Turnaround Time (STT) scheduling algorithms.

In 2008, Wu et al. [9] discussed the problem of choosing replica placement nodes. Lei et al. [10] analyzed data availability under file loss and bit loss when the storage capacity for replicas is constrained. Considering user resource priority and QoS, Lin et al. [11] proposed a novel data grid replica creation strategy based on a priority list, which can effectively balance the replica workload.

In 2010, Sashi and Thanamani [12] proposed a novel dynamic replica creation strategy based on the popularity of files. Later, in 2011, Mansouri and Dastghaibyfard [13] put forward an improved layered replica creation strategy; their experimental results showed that the proposed strategy outperformed existing strategies by about 14%.

2.2.2. CIA-based replica creation strategies

In contrast to the traditional strategies, CIA-based replica creation strategies adopt CIA as the main optimization algorithm. Although research on CIA-based replica creation strategies is still limited, interest in data grid replica creation and other aspects of DGRM is increasing.

In 2009, Naseera and Murthy [14] put forward an agent-based replica placement algorithm in which agents are deployed at each data node to determine the candidate node for replica placement. In 2010, Zhang et al. [15] presented a replication approach based on swarm intelligence, an adaptive and decentralized bottom-up method. Their simulation results showed that the method performs better than no replication and that it outperforms EM for large numbers of jobs.

86 T. Ma et al. / Knowledge-Based Systems 42 (2013) 85–96
