Mixed-Model Universal Software Thread-Level Speculation System

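"Mixed Model Universal Software Thread-Level Speculation (ICCP2013) - Computer Science"

In computer science, thread-level speculation (TLS) is an optimization technique for extracting parallelism on multi-core processors. Traditional TLS designs depend on hardware support, but recent research has explored software implementations that avoid the need for specialized hardware. The paper "Mixed Model Universal Software Thread-Level Speculation" (ICCP2013) is by Zhen Cao and Clark Verbrugge of the School of Computer Science at McGill University in Montréal, Canada.

The paper observes that existing software-TLS approaches focus on source-level or VM-level implementations, which ties them to specific languages and runtime environments. Most of them also use a simple thread forking model, which limits how much parallelism they can extract from programs with tree-form recursion, such as depth-first search and divide-and-conquer.

To overcome these limitations, the paper proposes MUTLS, a Mixed forking model Universal software-TLS system. MUTLS is built entirely on the LLVM intermediate representation (IR), which is independent of source language and target architecture and supports more than 10 source languages and multiple target architectures. Rather than relying on a pure in-order or pure out-of-order forking model, MUTLS uses a mixed forking model that maximizes parallel coverage by allowing every thread to speculate, so that the threads form a tree.

The authors evaluate MUTLS with C/C++ and Fortran benchmarks on a 64-core machine. On three computation-intensive applications they report speedups of 30 to 50 for the C versions and 20 to 50 for the Fortran versions, and speedups of 2 to 7 for memory-intensive applications. The results indicate that a mixed model is preferable to simple forking models for tree-form recursion, and that a purely software TLS system can achieve real speedups on existing commodity multi-core processors while remaining generic across languages and architectures.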

Mixed Model Universal Software Thread-Level Speculation

Zhen Cao and Clark Verbrugge
School of Computer Science, McGill University
Montréal, Québec, Canada H3A 0E9
Email: zhen.cao@mail.mcgill.ca, clump@cs.mcgill.ca

Abstract: Software approaches to Thread-Level Speculation (TLS) have been recently explored, bypassing the need for specialized hardware designs. These approaches, however, tend to focus on source or VM-level implementations aimed at specific language and runtime environments. In addition, previous software approaches tend to make use of a simple thread forking model, reducing their ability to extract substantial parallelism from tree-form recursion programs such as depth-first search and divide-and-conquer. This paper proposes a Mixed forking model Universal software-TLS (MUTLS) system to overcome these limitations. MUTLS is purely based on the LLVM intermediate representation (IR), a language and architecture independent IR that supports more than 10 source languages and target architectures by many projects. MUTLS maximizes parallel coverage by applying a mixed forking model that allows all threads to speculate, forming a tree of threads. We evaluate MUTLS using several C/C++ and Fortran benchmarks on a 64-core machine. On 3 computation intensive applications we achieve speedups of 30 to 50 and 20 to 50 for the C and Fortran versions, respectively. We also observe speedups of 2 to 7 for memory intensive applications. Our experiments indicate that a mixed model is preferable for parallelization of tree-form recursion applications over the simple forking models used by previous software-TLS approaches. Our work also demonstrates that actual speedup is achievable on existing, commodity multi-core processors while maintaining the flexibility of a highly generic implementation context.

Keywords: Thread-Level Speculation; Parallelization; Forking Model

I. INTRODUCTION

Thread-level speculation (TLS), or speculative multithreading (SpMT), is a safety-guaranteed approach to automatic or implicit parallelization. Speculative threads are optimistically launched at fork points, executing a code sequence from join points well ahead of their parent thread. Safety is preserved in this speculative model by buffering reads and writes of the speculative thread. Once the parent thread reaches the join point, the speculative thread may be joined, committing its speculative writes to main memory and merging its execution state into the parent thread, provided no read conflicts have occurred. In the presence of conflicts the speculative child execution is discarded or rolled back for re-execution by the parent.
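The fork, buffer, validate and commit-or-rollback cycle described above can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the MUTLS runtime: the names `SpeculativeThread`, `continuation` and `run` are hypothetical, the "threads" are simulated in a single Python thread, and read conflicts are detected by comparing buffered read values against main memory, which is one possible validation scheme rather than one the text specifies.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculativeThread:
    """Hypothetical speculative thread that buffers its reads and writes."""
    memory: dict                                    # shared "main memory"
    read_set: dict = field(default_factory=dict)    # address -> value observed
    write_buf: dict = field(default_factory=dict)   # address -> pending value

    def load(self, addr):
        if addr in self.write_buf:                  # see our own pending write
            return self.write_buf[addr]
        if addr not in self.read_set:               # first read: record value seen
            self.read_set[addr] = self.memory[addr]
        return self.read_set[addr]

    def store(self, addr, value):
        self.write_buf[addr] = value                # main memory is not touched

    def validate(self):
        # Assumed scheme: a read conflict exists if a value we read has changed.
        return all(self.memory[a] == v for a, v in self.read_set.items())

    def commit(self):
        self.memory.update(self.write_buf)          # publish speculative writes


def continuation(load, store):
    """Code that follows the join point: total = a + b."""
    store("total", load("a") + load("b"))


def run(parent_writes):
    memory = {"a": 1, "b": 2, "total": 0}
    child = SpeculativeThread(memory)               # fork point
    continuation(child.load, child.store)           # child runs ahead, buffered
    memory.update(parent_writes)                    # parent works up to the join
    if child.validate():                            # join point reached
        child.commit()
        outcome = "committed"
    else:
        # Rollback: drop the buffers and let the parent re-execute the code.
        continuation(memory.__getitem__, memory.__setitem__)
        outcome = "rolled back"
    return outcome, memory["total"]
```

Calling `run({})` models a parent that writes nothing the child read, so the child's buffered result is committed. Calling `run({"a": 10})` models a parent that overwrites a value the child already read, so the speculative work is discarded and the parent recomputes the result itself.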

Thread-level speculation has received significant attention in terms of hardware development as a feasible technique for automatic parallelization [4], [17], [16]. Software-only designs have been proposed, and have the advantage of applying to existing, commodity multiprocessors, but the immediacy of application requires some tradeoff in terms of increased overhead and compilation complexity, with existing research efforts based on prototype, language-specific implementations [12], [10]. Realistic and convincing evaluation of such designs, however, requires consideration of a full compiler infrastructure, one that enables both deep investigation and application to a variety of compilation contexts.

Fundamentally, TLS approaches differ in terms of forking models: how they create and manage speculative threads. Two main forking models exist, in-order, and out-of-order, and existing software models have been primarily based on one or the other of these strategies, which allow for good exploitation of parallelism in loops and deep method calls respectively, as discussed in section II. These simple forking models, however, have limitations with respect to the ability to extract parallelism, and a reliance on pure in-order or pure out-of-order design limits the amount of parallelism that can be found in more complex programs, including ones that make extensive use of tree-form recursion, such as found in depth-first search and divide-and-conquer programs.
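A toy model can make the limitation on tree-form recursion concrete. The sketch below counts how many threads each forking model could create for a binary recursion of a given depth. The rules in `may_fork` are assumptions drawn from the general TLS literature, since section II is not part of this excerpt: in-order is taken to mean each thread forks at most once (a chain), out-of-order to mean only the non-speculative thread forks, and mixed to mean any thread may fork (a tree of threads). All names are hypothetical, and the fork point is assumed to sit before the first recursive call.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    parent: Optional["Thread"] = None               # None: non-speculative thread
    children: List["Thread"] = field(default_factory=list)


def may_fork(model: str, thread: Thread) -> bool:
    if model == "in-order":        # assumed: at most one fork per thread
        return not thread.children
    if model == "out-of-order":    # assumed: only the non-speculative thread forks
        return thread.parent is None
    if model == "mixed":           # any thread may fork
        return True
    raise ValueError(f"unknown model: {model}")


def recurse(model: str, thread: Thread, depth: int) -> None:
    """Binary tree-form recursion with a fork point before the first call."""
    if depth == 0:
        return
    if may_fork(model, thread):
        child = Thread(parent=thread)
        thread.children.append(child)
        recurse(model, child, depth - 1)    # child speculates on the second call
        recurse(model, thread, depth - 1)   # parent executes the first call
    else:
        recurse(model, thread, depth - 1)   # fork not allowed: this thread runs
        recurse(model, thread, depth - 1)   # both calls one after the other


def count_threads(thread: Thread) -> int:
    return 1 + sum(count_threads(c) for c in thread.children)


def threads_used(model: str, depth: int) -> int:
    root = Thread()
    recurse(model, root, depth)
    return count_threads(root)
```

Under these assumptions, `threads_used("mixed", d)` grows as 2^d, while the in-order and out-of-order rules both yield only d + 1 threads, which is the sense in which a simple model leaves most of the recursion tree unparallelized. The model ignores thread limits, overhead and rollbacks, so it says nothing about actual speedup.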

In this work we propose the Mixed-model Universal software-TLS (MUTLS) system to overcome both limitations of existing software-TLS approaches. First, MUTLS uses a mixed forking model to maximize the potential to extract parallelism in more general classes of programs. Second, MUTLS is universal in that it is language and architecture neutral. Our approach is to build a pure software TLS design using the popular LLVM compiler framework [1]. We integrate our design into LLVM's machine and language-agnostic intermediate representation (IR), enabling generic application of TLS to arbitrary input and output contexts. This has the advantage of providing a full and non-trivial compiler context for evaluating TLS, as well as allowing the full range of source and hardware pairings enabled by the LLVM framework.

Our design is demonstrated and tested by modifying frontends for C/C++ and Fortran to support user-driven speculation. From this we are able to generate native executables (or JIT-based execution) for non-trivial benchmarks to evaluate performance, illustrating the potential of our approach as a means to explore and compare the use of TLS in different language contexts. Software TLS faces significant challenges in terms of balancing overhead concerns with the many design decisions possible in TLS implementation. Our system simplifies this research exploration by allowing for practical experimentation within an optimizing compiler context. This paper has the following specific contributions.

• We describe MUTLS, the first software-TLS implementation on a source language and target architecture independent
