A Task-Level OoO Execution Framework for Heterogeneous Systems: Automated Parallel Processing

Research paper

Updated 2024-08-26 · 810 KB PDF


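"The task-level OoO framework for heterogeneous systems addresses task-level out-of-order execution on multi-core processors through a three-layer architecture: a programming model, an OoO task scheduler, and processing elements. It carries the renaming scheme over from instruction-level parallelism to task-level parallelism, discovers parallel tasks automatically, and dynamically eliminates write-after-write (WAW) and write-after-read (WAR) dependencies between tasks, thereby improving system performance. The paper adopts the Tomasulo algorithm, applying a technique originally designed for ILP to TLP in order to optimize task scheduling and resource allocation."

This research paper proposes a task-level out-of-order (OoO) execution framework for heterogeneous systems (which may include CPUs of different architectures, GPUs, or other accelerators). In today's multi-core and heterogeneous computing environments, exploiting parallelism effectively is key to improving system performance. Traditional OoO execution focuses on instruction-level parallelism (ILP); this framework extends the idea to task-level parallelism (TLP).

First, the programming model layer lets developers describe tasks and their dependencies abstractly, so that the system can identify task parallelism automatically. The goal of this layer is to reduce programming complexity, letting programmers concentrate on task logic rather than on the low-level details of parallel execution.

Second, the OoO task scheduler decides the execution order of tasks and the allocation of resources. It uses a renaming scheme, a key technique borrowed from ILP, to detect and resolve data dependencies between tasks. Renaming lets tasks that conflict only through reuse of a name run in parallel, which reduces waiting time and wasted resources.

Third, the processing element layer consists of the entities that execute tasks, which may be processor cores, GPU units, or other accelerators. Its design takes the characteristics of heterogeneous hardware into account so that each task runs efficiently on suitable hardware.

The paper uses the Tomasulo algorithm to optimize task scheduling. Tomasulo's algorithm is a classic ILP technique for handling data hazards and resource contention in a pipeline. In this framework it is adapted to task-level scheduling, where it manages resources dynamically and eliminates WAW and WAR dependencies to improve overall system efficiency.

With this three-layer structure, the framework aims to exploit task-level parallelism in heterogeneous systems efficiently and to improve execution efficiency and responsiveness, particularly when handling many concurrent tasks, where it is expected to outperform traditional approaches. The work offers useful guidance for future software development on multi-core and heterogeneous processors.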


A TASK-LEVEL OOO FRAMEWORK FOR HETEROGENEOUS SYSTEMS

Junneng Zhang, Chao Wang, Xi Li∗, Peng Chen, Xiaojing Feng, Xuehai Zhou∗

Suzhou Institute for Advanced Study, University of Science and Technology of China

Suzhou, Jiangsu, China

zjneng@mail.ustc.edu.cn

saintwc@mail.ustc.edu.cn

qwe123@mail.ustc.edu.cn

bangyan@mail.ustc.edu.cn

∗School of Computer Science, University of Science and Technology of China

Hefei, Anhui, China

llxx@ustc.edu.cn

xhzhou@ustc.edu.cn

Abstract—This paper proposes a framework that targets task-level out-of-order (OoO) execution for heterogeneous systems. The framework consists of three layers: 1) a programming model; 2) an OoO task scheduler; 3) Processing Elements. To uncover task-level parallelism automatically, the renaming scheme is carried over from instruction-level parallelism (ILP) to task-level parallelism (TLP). With renaming, inter-task data dependencies are detected automatically during execution, and task-level WAW and WAR dependencies are eliminated dynamically. We apply the Tomasulo algorithm from ILP to perform task-level OoO execution, and have implemented a prototype on a state-of-the-art reconfigurable FPGA platform. Experimental results show that the framework is efficient for heterogeneous systems.

I. INTRODUCTION

Task-level parallelism (TLP) has been widely researched at different levels over the past decades, e.g. programming models [1]-[9], compilers and runtime libraries [10]-[13], and architectures [14]-[17]. Traditional programming models for TLP, such as OpenMP and MPI, perform well for regular operations (e.g. loops), but for irregular operations the results may be unsatisfactory. At the compiler level, detecting dependencies statically remains challenging. Alternatively, using a special architecture to support task-level out-of-order (OoO) execution seems efficient. However, how to make the architecture flexible enough to suit various systems remains unresolved, especially for heterogeneous systems with different types of Processing Elements (PEs).

In this paper, we take the programming model, compiler, and architecture into consideration, and aim to find an efficient way to uncover TLP for heterogeneous systems. The underlying system is an FPGA-based platform that contains different types of PEs: one or several general-purpose processors (GPPs) and a variety of Intellectual Property (IP) cores. So far the following features have been completed:

1) On the basis of state-of-the-art programming paradigms, we propose a programming model that supports TLP without explicit task scheduling by programmers. The program is divided into a series of tasks, which stand for functions to be executed on PEs (e.g. a GPP or an IP core).

2) We have implemented a hardware MP-Tomasulo module for heterogeneous platforms to support OoO task execution. The MP-Tomasulo module detects task-level data dependencies and eliminates WAW and WAR dependencies automatically at runtime using the renaming scheme.
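The effect of renaming on task-level WAW and WAR dependencies can be illustrated with a small software sketch. This is not the MP-Tomasulo hardware described in the paper; it is a minimal model under the assumption that each task declares the variables it reads and writes. The `Task` class, the `rename` function, and the `var.version` naming are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    reads: tuple   # variables the task reads
    writes: tuple  # variables the task writes


def rename(tasks):
    """Give every write a fresh version of its variable (hypothetical helper).

    After renaming, two tasks conflict only if one reads a version the
    other produced, i.e. only true (RAW) dependencies remain.
    """
    latest = {}    # variable -> (current version, producing task or None)
    renamed = []   # (task name, versioned reads, versioned writes)
    raw_deps = []  # (producer, consumer) pairs
    for t in tasks:  # program order
        srcs = []
        for var in t.reads:  # reads are resolved before this task's own writes
            version, producer = latest.get(var, (0, None))
            srcs.append(f"{var}.{version}")
            if producer is not None and (producer, t.name) not in raw_deps:
                raw_deps.append((producer, t.name))
        dsts = []
        for var in t.writes:
            version, _ = latest.get(var, (0, None))
            latest[var] = (version + 1, t.name)
            dsts.append(f"{var}.{version + 1}")
        renamed.append((t.name, tuple(srcs), tuple(dsts)))
    return renamed, raw_deps


program = [
    Task("T1", reads=("a",), writes=("b",)),
    Task("T2", reads=("b",), writes=("c",)),  # RAW on b (from T1)
    Task("T3", reads=("d",), writes=("b",)),  # WAW with T1, WAR with T2
    Task("T4", reads=("b",), writes=("e",)),  # RAW on b (from T3)
]
renamed, raw_deps = rename(program)
```

In this example T3 is given its own version of `b`, so it no longer has to wait for T1 to write `b` or for T2 to read it. Only the T1 to T2 and T3 to T4 dependencies are left.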

The rest of the paper is organized as follows: Section II illustrates the programming model, Section III details the OoO task execution of the MP-Tomasulo module, Section IV gives the experimental method and results, Section V describes related work, and Section VI summarizes the paper.

II. PROGRAMMING MODEL LAYER

To make TLP efficient for heterogeneous systems, we propose a framework composed of three layers. The top layer is the programming model layer, which provides programmers with interfaces for parallel programming. The middle layer is the scheduler layer, which is in charge of task-level OoO scheduling using the renaming scheme. The bottom layer consists of the PEs, which are responsible for task execution. Throughout this paper, tasks refer to dynamic instances created when application programming interfaces are invoked by user applications [18]. Furthermore, tasks are regarded as abstract functional instructions, and each IP core is treated as a dedicated functional unit that runs a specific hardware task.
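Treating tasks as abstract instructions and IP cores as functional units suggests the following toy model of out-of-order issue. It is a greedy software simulation, not Tomasulo's algorithm itself (there are no reservation stations or result bus), and it assumes name dependencies have already been removed so that each task lists only the earlier tasks that produce its inputs. The function `ooo_schedule`, the task tuples, and the unit kinds `"fft"` and `"aes"` are hypothetical.

```python
def ooo_schedule(tasks, units):
    """Greedy out-of-order issue (illustrative sketch only).

    tasks: list of (name, kind, duration, producers) in program order;
           producers must name earlier tasks.
    units: dict mapping a task kind to the number of units of that kind.
    Returns {name: (start, finish)}.
    """
    free_at = {kind: [0] * count for kind, count in units.items()}
    schedule = {}
    pending = list(tasks)
    while pending:
        best = None  # (start time, task, unit index)
        for task in pending:
            name, kind, duration, producers = task
            if any(p not in schedule for p in producers):
                continue  # an input is not scheduled yet
            ready = max((schedule[p][1] for p in producers), default=0)
            slots = free_at[kind]
            slot = min(range(len(slots)), key=slots.__getitem__)
            start = max(ready, slots[slot])
            # Strict '<' keeps program order when start times tie.
            if best is None or start < best[0]:
                best = (start, task, slot)
        start, task, slot = best
        name, kind, duration, _ = task
        schedule[name] = (start, start + duration)
        free_at[kind][slot] = start + duration
        pending.remove(task)
    return schedule


tasks = [
    ("T1", "fft", 4, ()),
    ("T2", "fft", 4, ("T1",)),
    ("T3", "aes", 2, ()),
    ("T4", "aes", 2, ("T3",)),
]
schedule = ooo_schedule(tasks, {"fft": 1, "aes": 1})
```

Here T3 and T4 come later in program order than T2, yet both finish before T2 starts, because their functional unit is idle while T2 waits for T1.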

The programming model is derived from CellSs, which uses annotations to define tasks. In our framework, PEs are divided into two categories: hardware PEs and software PEs. Each type of hardware PE can perform only one specific kind of task, while software PEs can perform all kinds of tasks. The interfaces for calling hardware PEs are defined in runtime libraries. To use software PEs, programmers need to define the tasks as in CellSs.
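As a rough analogy for annotation-based task definition, the sketch below uses a Python decorator in place of source annotations. It does not reproduce the CellSs syntax or the paper's runtime library; the `task` decorator, the `TASK_QUEUE` list, and the `scale` function are all hypothetical names. The point it illustrates is that calling an annotated function creates a dynamic task instance, with declared inputs and outputs, instead of running the function immediately.

```python
import functools

TASK_QUEUE = []  # dynamic task instances in program order (hypothetical runtime state)


def task(inputs=(), outputs=()):
    """Mark a function as a task and declare which parameters it reads and writes."""
    def wrap(fn):
        @functools.wraps(fn)
        def submit(**kwargs):
            # Invoking the function records a task instance for the scheduler
            # rather than executing the body right away.
            TASK_QUEUE.append({
                "name": fn.__name__,
                "reads": tuple(kwargs[p] for p in inputs),
                "writes": tuple(kwargs[p] for p in outputs),
            })
        return submit
    return wrap


@task(inputs=("src",), outputs=("dst",))
def scale(src, dst):
    """Body that a software PE would run; omitted in this sketch."""


scale(src="a", dst="b")
scale(src="b", dst="c")
```

The declared read and write sets are the information a scheduler layer needs to detect dependencies between task instances without any explicit scheduling by the programmer.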

978-1-4673-2845-6/12/$31.00 © 2012 IEEE
