MapCG: CPU与GPU源代码级兼容性解决方案

125 浏览量更新于2024-07-15 收藏 2.57MB PDF 举报

随着图形处理单元（Graphics Processing Units, GPU）在通用计算市场的显著增长，如何实现CPU和GPU之间的源代码级可移植性已经成为一个重要课题。这篇2012年发表在《计算机科学技术》杂志上的研究论文《利用MapCG提供CPU与GPU的源代码级可移植性》由洪春涛、陈德颢等人探讨了一种创新的方法——MapCG，以解决这一问题。 MapCG的主要目标是为程序员提供一个桥梁，让他们能在编写高性能应用时，同时保持代码对CPU和GPU的兼容性。传统的编程方式通常依赖于特定的GPU API，如CUDA，这些API专为GPU设计，使得编写针对GPU的低级别代码成为必要。然而，这种方式导致了代码的平台依赖性，不利于跨平台应用的开发和维护。论文的核心贡献是MapCG框架，它允许开发者以标准的CPU语言（如C++）编写代码，然后通过MapCG工具自动转换和优化为GPU版本。MapCG通过智能分析和映射，将CPU代码中的数据并行和计算任务映射到GPU的并行架构上，实现了在不修改原有代码结构的情况下提升GPU性能。作者们首先介绍了MapCG的工作原理，包括其基于编译器的技术路线，以及如何解析、分析和重写CPU代码以适应GPU的执行模式。他们强调了MapCG在保留代码可读性和可维护性的同时，如何通过优化数据访问和控制流来提高性能。论文还详细讨论了MapCG在实际项目中的应用案例，展示了它在大规模并行计算、科学计算和图形处理等领域的潜力。实验结果表明，使用MapCG编写的代码能够在保持相似性能的前提下，轻松地在CPU和GPU之间切换，提高了开发者的生产力和代码复用性。此外，文中也探讨了MapCG面临的挑战，如代码优化的复杂性、跨平台兼容性和GPU硬件更新带来的兼容性问题，以及如何通过持续的研究和改进来克服这些问题。这篇论文提出了MapCG作为一种创新的解决方案，它简化了CPU与GPU间的编程接口，促进了高性能计算的代码复用和平台独立性，对于推动GPU编程的普及和应用有着重要意义。对于任何关注GPU编程和高性能计算的开发者来说，理解MapCG的工作原理和优势，无疑有助于他们在实践中更好地利用这两种强大的计算资源。

44 J. Comput. Sci. & Technol., Jan. 2012, Vol.27, No.1

usually create a small number of threads, close to the

number of processors, while GPU programs usually cre-

ate thousands, or even millions of threads. As a re-

sult, the granularity of parallelism is diﬀerent between

a CPU and GPU. In CPU programs, jobs are divided

into small number of sub-tasks, each assigned to a CPU

thread. GPU programs, on the other hand, divide jobs

into a large number of ﬁne-grained sub-tasks, each as-

signed to a GPU thread.

The other obstacle is that CPU and GPU have

diﬀerent ISA. In the early days of GPU computing,

the pioneer programmers used OpenGL to program

GPUs. With time the idea of the general purpose GPU

(GPGPU) has received much more attention, which has

led to changes in both the GPU hardware and pro-

gramming languages. Many features that were once

CPU-only are now used in GPU designs, such as dou-

ble precision arithmetic operations, atomic operations,

and cache. However programmers are still required to

write diﬀerent code for the CPU and GPU. Code writ-

ten for the CPU cannot be run directly on the GPU,

and vice versa.

To solve the ﬁrst problem, we provide a

MapReduce

[10]

based programming API. The MapRe-

duce based API enables programmers to specify the

parallelism without needing to know the underlying

threading model. We will discuss the programming

API in detail in Subsection 2.2. The second problem

is solved by deﬁning the programming language. In

MapCG, programmers write code in a deﬁned subset of

the C++ language, which is then translated into corre-

sponding CPU/GPU code with a source code transla-

tion tool. This is discussed in detail in Subsection 2.3.

Fig.1 shows an overview of the MapCG framework.

Fig.1. Overview of the MapCG framework.

The MapCG API provides a MapReduce-style para-

llel programming model. Programmers express the

parallelism of the application in the MapReduce model,

and write the map() and reduce() functions in a de-

ﬁned language, which provides a uniﬁed view of the

ISA of CPU and GPU. The framework generates CPU

and GPU versions of the Map and Reduce functions by

source code translation, and then uses the MapCG run-

time library to execute them on CPU and GPU respec-

tively. The MapCG runtime is responsible for executing

MapReduce code eﬃciently on diﬀerent architectures.

It also ﬁlls the gaps between architecture capabilities.

For example, it provides the string library on the GPU,

which enables the programmer to call the string library

in the Map and Reduce functions. We also allow co-

processing of CPU and GPU, i.e., using CPU and GPU

together to ﬁnish the map() and reduce() tasks. This

is accomplished by dispatching the map() and reduce()

tasks to the CPU and GPU simultaneously.

At present, the MapCG runtime on CPU is imple-

mented using C++ and OpenMP. The runtime on GPU

is implemented using CUDA, the most commonly used

GPU programming mo del. OpenCL is another option

for implementing the MapCG runtime on GPU. We

leave this for the future work.

2.2 Programming Interface

The MapCG API is based on the MapReduce

[10]

programming model. The MapReduce programming

model originated from functional languages, and was

proposed as a parallel programming model by Google.

Applications written in MapReduce execute in a map-

reduction manner, which is very straightforward. Pro-

grammers simply specify how to split the input data

and how to do the mapping and reduction. The run-

time environment schedules the map() and reduce()

tasks to parallel threads. It also takes care of com-

munication and fault tolerance. Due to its simplicity,

the MapReduce programming model has attained great

popularity since its creation. It has been used widely

in data mining

[11]

, machine learning

[12]

and many other

ﬁelds

[13-14]

The execution of a MapReduce program can be sim-

ply described as follows.

1) The MapReduce framework divides the input

data into pieces using a programmer speciﬁed split()

function.

2) Each piece of the input data is then passed as

input to an instance of the map() function, forming

a map task. Map tasks execute in parallel in diﬀer-

ent threads/processes, each processing its own data.

While processing, map tasks may produce (key, value)

pairs, which are called intermediate pairs. Map tasks

notify the MapReduce framework about the pairs by

calling the emit intermediate() function deﬁned by the

剩余14页未读，继续阅读

weixin_38677806

粉丝: 5
资源: 938

MapCG: CPU与GPU源代码级兼容性解决方案

TeeChart2013_130818_SourceCode

Matters Computational ideas, algorithms, source code

TeeChart2013_131216_SourceCode

This is the main source code repository for Rust. It contains th

Transitions and critical events in the family life cycle: Implications for providing support to families of children with disabilities

Practical API Architecture and Development with Azure and AWS

Advantages of Source Code Compilation and Construction Brought by Tsinghua Mirror Site Address

【Practical Exercise】Ant Colony Algorithm for 3D Path Planning MATLAB Source Code

Practical Experience with Tsinghua Mirror Source Address and System Performance Optimization

Refactoring Conditional Code in MATLAB: Enhancing Readability and Maintainability of Conditional ...

最新资源