3. System architecture
Fig. 1 depicts our cluster model, in which a fixed number of multi-core workstations are connected by a high-speed network. We assume the communication cost between any two nodes is the same, so the multi-core cluster can be viewed as a two-level hierarchical system: a distributed memory level consisting of the cluster nodes, and a shared memory level consisting of the multiple cores within each node. To exploit both inter-node parallelism and intra-node parallelism, we propose a hierarchical task scheduling scheme in which tasks are scheduled with different approaches on these two levels in order to achieve dynamic load balancing. Briefly, work-stealing is used for load balancing inside a node, and an adaptive approach that supports both work-sharing and work-stealing is used for load balancing among the cluster nodes.
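To make the hierarchy concrete, the following C++ sketch outlines the two scheduler roles. The type and field names are ours and merely illustrate the structure described above; they are not taken from the actual implementation.

#include <cstddef>
#include <deque>
#include <vector>

struct Task { };                      // application-defined unit of work

struct LocalScheduler {               // one per node: the shared memory (intra-node) level
    std::size_t node_id;
    std::vector<std::deque<Task*>> per_core_deques;   // one work-stealing deque per core
};

struct GlobalScheduler {              // one on the master node: the distributed memory (inter-node) level
    std::vector<std::size_t> task_counters;           // approximate load of each worker node (described below)
};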
Assume all the program and data files have been deployed on each node in Fig. 1. The user logs on to a node to start the program, and this node is then viewed as the master node. The master node could be any node in the cluster or a specific node, such as a resource manager node.² A global scheduler (GS) runs on the master node and is responsible for inter-node task scheduling, including the initial partitioning and the redistribution of tasks between worker nodes. The novel techniques used in the GS distinguish our system from existing work-stealing systems [14,15,20].
There is no initial partitioning phase in traditional work-stealing schemes. Under traditional work-stealing, one initial task runs on a processing element (PE), and new tasks are spawned continually and stolen by idle PEs during execution. In shared memory systems, a spawned task can migrate to an idle thread very quickly, so the absence of initial partitioning merely adds a few task migrations whose cost is negligible. In distributed memory systems, however, task migration has much higher overhead, which is no longer negligible. Initial partitioning can balance the load statically before the parallel execution of the tasks and thus reduce the frequency of dynamic task stealing across the nodes; an ideal partitioning could even eliminate inter-node task migration entirely. Therefore, we adopt an initial partitioning phase in our system. In this phase the global scheduler partitions the parallel portions of an application adaptively, according to the pattern of task parallelism. The details of our initial partitioning are described in the next section.
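As a placeholder for that adaptive partitioning, the sketch below shows the simplest possible initial partitioning, a round-robin split of the initially ready tasks over the nodes. The function name and the even split are our assumptions for illustration only, not the scheme actually used by the GS.

#include <cstddef>
#include <vector>

// Naive initial partitioning: assign the initially ready tasks to nodes round-robin,
// so each node starts with roughly the same number of tasks.
template <typename Task>
std::vector<std::vector<Task>> initial_partition(const std::vector<Task>& ready_tasks,
                                                 std::size_t num_nodes) {
    std::vector<std::vector<Task>> per_node(num_nodes);
    for (std::size_t i = 0; i < ready_tasks.size(); ++i)
        per_node[i % num_nodes].push_back(ready_tasks[i]);
    return per_node;
}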
After initial partitioning, the tasks that are ready to run are scheduled onto the cluster nodes and executed in parallel on each multi-core node. In Fig. 1, every node, including the master node, has a local scheduler (LS) which is responsible for intra-node task scheduling and cooperates with the GS. When the LS receives a task from the GS, a classical work-stealing scheme is applied on the shared memory multi-core node to split and schedule subtasks.
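The intra-node scheme can be sketched as follows: each worker thread pushes and pops tasks at the private end of its own deque, and an idle thread steals from the opposite end of a randomly chosen victim. The mutex-protected deque below is a simplification for brevity; production runtimes typically use lock-free deques.

#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <random>
#include <vector>

using Task = int;                                        // stand-in for an application task

struct WorkerDeque {
    std::deque<Task> tasks;
    std::mutex lock;
};

// Called by the owning thread: take the most recently pushed task.
std::optional<Task> pop_local(WorkerDeque& d) {
    std::lock_guard<std::mutex> g(d.lock);
    if (d.tasks.empty()) return std::nullopt;
    Task t = d.tasks.back();
    d.tasks.pop_back();
    return t;
}

// Called by an idle thread: steal the oldest task of a randomly chosen victim.
std::optional<Task> steal(std::vector<WorkerDeque>& deques, std::size_t self,
                          std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, deques.size() - 1);
    std::size_t victim = pick(rng);
    if (victim == self) return std::nullopt;             // caller retries later
    std::lock_guard<std::mutex> g(deques[victim].lock);
    if (deques[victim].tasks.empty()) return std::nullopt;
    Task t = deques[victim].tasks.front();
    deques[victim].tasks.pop_front();
    return t;
}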
The LS keeps track of the number of tasks on the node. Based on this value, the LS determines whether an inter-node task migration is necessary (a sketch of this decision follows the list below):
(1) If the number of tasks exceeds a threshold, the LS sends a work-sharing request message to the GS, asking to transfer a task from the local task queues to a lightly loaded cluster node.
(2) If all the local task queues become empty, the LS sends a work-stealing request message to the GS, asking to relieve the heavily loaded worker nodes by stealing a task from one of them.
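A minimal sketch of this decision rule, assuming a single aggregate task count per node; the enum and the threshold parameter are hypothetical names, not identifiers from the implementation.

#include <cstddef>

enum class Request { None, WorkSharing, WorkStealing };

// threshold: maximum number of queued tasks before the node offers work away.
Request decide_migration(std::size_t local_task_count, std::size_t threshold) {
    if (local_task_count > threshold)
        return Request::WorkSharing;   // overloaded: ask the GS to place one task elsewhere
    if (local_task_count == 0)
        return Request::WorkStealing;  // idle: ask the GS for a task from a busy node
    return Request::None;              // neither overloaded nor idle: keep working
}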
Whether the LS decides to push a task to or steal a task from another node, the target node (the victim) must be determined before the task migration. As mentioned in the previous section, random victim selection is optimal for shared memory systems, but it is inefficient when applied to distributed memory systems. In our system, the victim is not selected randomly by the LS but determined by the GS. As shown in Fig. 1, when the GS receives a work-sharing or work-stealing request, it selects the most lightly loaded node as the victim of a work-sharing request, or the busiest node as the victim of a work-stealing request. The GS then notifies the victim to migrate a task between itself and the requester. Task migration among the cluster nodes is thus centrally controlled by the GS. To support such centralized control, the GS needs real-time information about the tasks on all nodes, including task sizes and task migration costs. However, such information is neither cheap nor easy to obtain. A simpler approach is to use the number of tasks as a measure of the workload on a node. In our implementation, the GS maintains a task counter for each worker node. Each node periodically updates its task counter with the number of tasks currently in its local task queues, by sending a message to the GS. The task counters are used for (1) determining whether a work-sharing or work-stealing operation should be conducted, (2) selecting the victim, and (3) detecting global termination. The details of the implementation can be found in Section 5.
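The following sketch shows how the GS could pick a victim from its per-node task counters: the most lightly loaded node for a work-sharing request and the most heavily loaded node for a work-stealing request. The function name and the rule of excluding the requester are our assumptions.

#include <cstddef>
#include <vector>

// counters[i] holds the task count last reported by node i.
// Returns the index of the victim node, or -1 if no other node exists.
int select_victim(const std::vector<std::size_t>& counters,
                  std::size_t requester, bool sharing_request) {
    int victim = -1;
    for (std::size_t i = 0; i < counters.size(); ++i) {
        if (i == requester) continue;
        if (victim < 0 ||
            ( sharing_request && counters[i] < counters[victim]) ||   // lightest node
            (!sharing_request && counters[i] > counters[victim]))     // busiest node
            victim = static_cast<int>(i);
    }
    return victim;
}

A real implementation would also verify that the chosen victim can actually serve the request, for example that a work-stealing victim still reports a non-zero counter before it is notified.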
The use of a GS does not limit the scalability of the system. On the contrary, our GS/LS design is suitable for building a scalable system, for the following reasons. First, tasks are not transferred through the GS, but directly between two LSs; only a few short messages are exchanged between the GS and each LS (see the implementation section). Second, as the number of cluster nodes increases, we can deploy multiple GSs to construct a hierarchical architecture and thus achieve scalability. Each GS controls a limited number of LSs, and the GSs communicate with each other or with an upper-level scheduler. Moreover, the multi-level schedulers can be adapted to the topology of the system architecture to improve data locality.
² In this case, our framework can be improved by utilizing the information about available resources and real-time loads, which is obtained from the resource manager.