Generating Workflow Execution Plans with an Optimized List Scheduling Algorithm in the Cloud Computing Environment


"本文提出了一种基于改进列表调度算法的工作流执行计划生成方法，旨在提高云环境中的处理器利用率，降低科学工作流的执行成本。该算法融合了列表调度和任务复制的思想，通过优化任务优先级选择合适的父任务进行复制，减少任务间的开销，并在处理器空闲时适当地插入任务，提升处理器利用率。实验结果表明，该算法EPGILS在减少任务完成时间方面是可行且高效的。" 在云计算环境中，工作流执行计划的生成是一个关键问题，它直接影响到科学计算任务的性能和成本效率。传统的调度算法可能无法充分利用云环境的动态性和可扩展性。因此，本文作者提出了一种名为EPGILS（Enhanced Priority-based Genetic List Scheduling，增强型基于优先级的遗传列表调度）的改进算法。首先，EPGILS算法采用了列表调度策略，这是一种常用的作业调度方法，通过根据任务的优先级顺序来决定执行顺序。然而，仅依赖优先级可能会导致处理器资源的不均衡分配。为解决这一问题，算法结合了任务复制的概念，选择具有高优先级的父任务进行复制，以优化任务之间的依赖关系，减少等待时间和通信开销。其次，EPGILS算法还利用了处理器的空闲时间进行任务插入。当处理器出现空闲时段时，算法会智能地插入新的任务，以提高处理器的整体利用率，从而减少整体执行时间。这种策略有助于避免处理器资源的浪费，提高云环境的资源利用率。实验部分，作者对比了EPGILS算法与其他常见调度算法的表现，结果显示EPGILS在减少任务完成时间方面具有显著优势。这表明，该算法能够更好地适应云环境的动态变化，有效地平衡任务执行和资源消耗，对于科学工作流的高效执行具有重要意义。此外，尽管文章并未深入探讨算法的复杂性和可扩展性，但可以推断，EPGILS的设计考虑了云环境的特性，可能具有较好的可伸缩性和适应性，能够在处理大规模工作流任务时保持高效性。这为云环境中的工作流调度提供了一个新的优化方向，对于进一步提高云服务质量和用户满意度具有积极的推动作用。这篇研究论文提出了一个创新的云环境下工作流执行计划生成策略，通过改进列表调度算法和任务复制策略，有效提升了处理器利用率，降低了执行成本，为云服务提供商和科研机构提供了更优的计算资源调度方案。

Workflow Execution Plan Generation in the Cloud Computing Environment Based on an Improved List Scheduling Algorithm

Xiaoying WANG, Chengshui NIU, Yu-an ZHANG

Department of Computer Science and Technology

Qinghai University

Xining, Qinghai, China, 810016

E-mail: {wxy_cta, csniu, yazhang}@qhu.edu.cn

Lei ZHANG

College of Computer Science

Sichuan University

Chengdu, China, 610064

E-mail: zhanglei@scu.edu.cn

Abstract-Aiming at a higher processor utilization ratio and a lower execution cost for scientific workflows in the cloud environment, this paper proposes an improved list scheduling algorithm. The algorithm combines the ideas of list scheduling and task duplication. Choosing a suitable parent task to replicate, according to task priority, helps reduce the overhead between tasks, and inserting tasks into processor idle time helps increase processor utilization. Based on these ideas, we propose an improved strategy for generating the workflow execution plan, called EPGILS. Experimental results show that the algorithm is feasible and efficient in reducing task completion time and improving processor utilization.

Keywords-scientific workflow; cloud computing environment; list scheduling; task scheduling; execution plan

I. INTRODUCTION

Scientific workflow (SWF) is a new type of application that has developed rapidly in recent years. It helps scientists and researchers integrate, construct, and coordinate various distributed data services and software tools, providing a management platform for defining complex workflows and automating the execution of scientific computations [1]. Compared with the traditional Business Workflow (BWF), one of the most important features of the scientific workflow is that it is data-oriented. Nowadays, SWF is becoming both computation-intensive and data-intensive [2], since data are generated continuously and quickly and the related computation becomes correspondingly complex. An ordinary computing environment can hardly meet the requirements of SWF. Hence, the cloud computing environment provides a new deployment and execution paradigm for scientific applications, since its infrastructure usually comprises high-performance computing resources and massive storage resources.

In the cloud computing environment, customers can rent resources on demand. However, changing the resource allocation scheme involves creating instances and moving data, which incurs costs and may affect both workflow execution efficiency and total cost [3]. Thus, it is very important to design reasonable workflow execution plans for both users and service providers [4]. In other words, generating a plan means mapping the tasks in the workflow onto the computing resources appropriately.

Hence, in this paper, we propose a task scheduling algorithm, named EPGILS (Execution Plan Generation based on Improved List Scheduling), for workflow execution plan generation based on the list scheduling algorithm. The motivation is to use the idle time of processors efficiently, thereby reducing the number of processors needed for task execution. EPGILS combines the advantages of list scheduling and task duplication to generate the execution plan of scientific workflow tasks on a homogeneous cloud infrastructure. Experimental results show that this strategy can effectively enhance the parallelism of workflow execution. As a result, not only is the total time spent reduced, but idle processor time slices are also better utilized, and thus the total execution cost of running these tasks is lowered.
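The idea of using processor idle time can be illustrated with a generic insertion-based slot search. The sketch below is not taken from the paper: the function name `earliest_start_with_insertion`, the representation of a processor as a list of non-overlapping busy slots, and the sample numbers are assumptions made for illustration.

```python
from typing import List, Tuple

Slot = Tuple[float, float]  # (start, finish) of a task already placed on the processor


def earliest_start_with_insertion(busy: List[Slot], ready: float, duration: float) -> float:
    """Return the earliest start time >= ready at which a task of the given
    duration fits on one processor, reusing idle gaps between busy slots.

    Assumes the busy slots do not overlap each other.
    """
    candidate = ready
    for start, finish in sorted(busy):
        if candidate + duration <= start:
            # The task fits entirely in the idle gap before this busy slot.
            return candidate
        # Otherwise the task cannot start before this slot has finished.
        candidate = max(candidate, finish)
    # No gap was large enough: run after the last busy slot.
    return candidate


busy_slots = [(0.0, 2.0), (6.0, 9.0)]
# A 3-unit task fits into the idle gap between time 2 and time 6.
print(earliest_start_with_insertion(busy_slots, ready=1.0, duration=3.0))  # 2.0
# A 5-unit task does not fit in that gap and is appended at the end.
print(earliest_start_with_insertion(busy_slots, ready=1.0, duration=5.0))  # 9.0
```

Filling gaps this way keeps a task on an already rented processor instead of opening a new one, which is the effect the motivation above describes.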

II. RELATED WORK

The key issue in workflow execution plan generation is how to schedule a task onto a proper computing resource [5]. Most algorithms proposed in prior work are heuristic, including clustering [6], task duplication [7], list scheduling [8], and GA (Genetic Algorithm)-based algorithms [9]. Clustering-based algorithms tend to map a cluster to a virtual machine (VM), but do not consider the task duplication problem among multiple clusters. Task duplication algorithms replicate tasks from one VM to another in order to reduce the communication overhead between tasks, but the target server is usually hard to choose. Genetic algorithms can converge to a close-to-optimal schedule through selection, crossover, and mutation, but the next task to be scheduled can hardly be predetermined because of the randomness of the selection process. Also, the crossover and mutation processes usually consume much computation time [10]. List scheduling algorithms determine the priority of each task before it is executed, so that the entire task queue is available before execution. Heuristic algorithms have been widely used by many researchers since they are often robust and cost-efficient.
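To make concrete how a list scheduler fixes the whole task queue before execution, the sketch below ranks the tasks of a small workflow by upward rank, i.e. the length of the longest path from a task to an exit task. Upward rank is a common priority in the list scheduling literature and is used here as an assumption; the excerpt does not state which priority function EPGILS uses. The four-task workflow and its costs are hypothetical.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical workflow: computation cost per task and
# communication cost per dependency edge (parent, child).
COST: Dict[str, float] = {"A": 2.0, "B": 3.0, "C": 1.0, "D": 2.0}
EDGES: Dict[Tuple[str, str], float] = {
    ("A", "B"): 1.0,
    ("A", "C"): 4.0,
    ("B", "D"): 2.0,
    ("C", "D"): 1.0,
}


def priority_list(cost: Dict[str, float], edges: Dict[Tuple[str, str], float]) -> List[str]:
    """Order tasks by decreasing upward rank."""
    children: Dict[str, List[Tuple[str, float]]] = {task: [] for task in cost}
    for (parent, child), comm in edges.items():
        children[parent].append((child, comm))

    @lru_cache(maxsize=None)
    def upward_rank(task: str) -> float:
        # Own cost plus the most expensive (transfer + rank) among children;
        # an exit task has no children, so its rank is just its own cost.
        return cost[task] + max(
            (comm + upward_rank(child) for child, comm in children[task]),
            default=0.0,
        )

    # With positive costs a parent always outranks its children,
    # so the resulting queue respects every dependency.
    return sorted(cost, key=upward_rank, reverse=True)


print(priority_list(COST, EDGES))  # ['A', 'B', 'C', 'D']
```

The queue is computed once from the workflow graph, so every scheduling decision that follows can take the next task from the front of the list.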

This paper proposes an improved algorithm, EPGILS, based on the list scheduling algorithm and the task-duplication-based algorithm. In a homogeneous cloud

2017 International Conference on Computing Intelligence and Information System

DOI 10.1109/CIIS.2017.59

