A Practical Guide to Parallel Computing with OpenMP

Updated 2024-07-29 · 3.26 MB PDF

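"Using OpenMP", original English edition of the parallel programming text.

OpenMP (Open Multi-Processing) is an application programming interface (API) for parallel programming on shared-memory multiprocessor systems. The standard is maintained by the OpenMP Architecture Review Board (ARB), whose members include computer hardware and software vendors. It aims to simplify multithreaded programming so that developers can use multiple processor cores to speed up compute-intensive work. OpenMP supports C, C++, and Fortran and is widely used in scientific and engineering computing.

The main OpenMP concepts include:

1. **Parallel regions**: The most basic OpenMP construct. The `#pragma omp parallel` directive marks a block of code to be executed by a team of threads. Every thread in the team executes the whole block; the work is divided among the threads only where a work-sharing construct asks for it. Unless the program requests a specific number of threads, the runtime chooses the team size.

2. **Thread teams**: Inside a parallel region the threads form a team, and each thread has a unique ID within that team (available through `omp_get_thread_num()`). Loop iterations are handed to different threads only when a work-sharing loop construct is used; they are not distributed by default.

3. **Synchronization**: OpenMP provides the `barrier`, `critical`, and `atomic` directives, plus lock routines such as `omp_set_lock` and `omp_unset_lock` for mutual exclusion. These coordinate threads and prevent data races. There is no `mutex` directive.

4. **Work-sharing constructs**: `#pragma omp for` and `#pragma omp sections` divide work among the members of the team. The `for` construct distributes loop iterations, and its `schedule` clause selects static, dynamic, or guided scheduling. The `sections` construct splits code into independent blocks, each executed by one thread.

5. **Combined constructs**: `#pragma omp parallel for` and `#pragma omp parallel sections` create a parallel region and apply a work-sharing construct in a single directive. Functions called from inside such a region run on whichever thread makes the call.

6. **Data-sharing attributes**: The clauses `default(none)`, `shared`, `private`, `firstprivate`, `lastprivate`, and `reduction` control whether each variable is shared or private to a thread inside the region, and how private copies are initialized and combined. Stating these attributes correctly avoids data races and inconsistent results.

7. **Runtime adjustment**: Library routines change the behavior of later parallel regions at run time. `omp_set_num_threads` and `omp_set_dynamic` affect how many threads are used, while `omp_set_max_active_levels` and the older `omp_set_nested` (deprecated since OpenMP 5.0) control nested parallelism.

8. **Environment variables**: OpenMP reads environment variables for its default settings. For example, `OMP_NUM_THREADS` sets the default number of threads used for parallel regions.

OpenMP can noticeably speed up compute-bound programs, particularly in scientific computing, fluid dynamics simulation, and large-scale data analysis. Parallel programming with OpenMP still requires attention to thread safety, load balancing, and unnecessary synchronization overhead. With a solid understanding of OpenMP, developers can build efficient, scalable parallel applications that make full use of modern multicore processors.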
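As a minimal sketch of how several of these concepts fit together, the C fragment below uses a combined `parallel for` construct with a `reduction` clause, and a plain parallel region with a `critical` section. The function names `sum_of_squares` and `count_team_threads` are illustrative assumptions and do not come from the book. The code calls no OpenMP library routines, so if it is compiled without OpenMP support the pragmas are ignored and it runs serially with the same results.

```c
/* Sum 0^2 + 1^2 + ... + (n-1)^2 with a combined parallel work-sharing loop.
 * Hypothetical helper for illustration only. */
long long sum_of_squares(int n)
{
    long long total = 0;

    /* default(none) forces the sharing of every variable to be stated.
     * reduction(+:total) gives each thread a private copy of total,
     * initialised to 0, and adds the copies together after the loop.
     * The loop variable i is private to each thread automatically. */
    #pragma omp parallel for default(none) shared(n) reduction(+:total) schedule(static)
    for (int i = 0; i < n; i++) {
        total += (long long)i * i;
    }
    return total;
}

/* Count the threads in a parallel region. Every thread in the team runs
 * the block once; the critical section lets only one thread at a time
 * update the shared counter. Hypothetical helper for illustration only. */
int count_team_threads(void)
{
    int count = 0;

    #pragma omp parallel default(none) shared(count)
    {
        #pragma omp critical
        count++;
    }
    return count;
}
```

To try it, call the two functions from a small `main` and build with OpenMP enabled, for example `gcc -fopenmp` with GCC. Setting `OMP_NUM_THREADS` before running changes the value returned by `count_team_threads`, while `sum_of_squares` returns the same value for any team size because integer addition does not depend on the order in which threads contribute.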

Foreword

Programming languages evolve just as natural languages do, driven by human desires to express thoughts more cleverly, succinctly, or elegantly than in the past. A big difference is the fact that one key receiver of programs is nonhuman. These nonhumans evolve faster than humans do, helping drive language mutation after mutation, and, together with the human program writers and readers, naturally selecting among the mutations.

In the 1970s, vector and parallel computer evolution was on the move. Programming assistance was provided by language extensions (first to Fortran and then to C) in the form of directives and pragmas, respectively. Vendors differentiated themselves by providing “better” extensions than did their competitors; and by the mid-1980s things had gotten out of hand for software vendors. At Kuck and Associates (KAI), we had the problem of dealing with the whole industry, so Bruce Leasure and I set out to fix things by forming an industrywide committee, the Parallel Computing Forum (PCF). PCF struck a nerve and became very active. In a few years we had a draft standard that we took through ANSI, and after a few more years it became the ANSI X3.H5 draft. Our stamina gave out before it became an official ANSI standard, but the industry paid attention, and extensions evolved more uniformly.

This situation lasted for a few years, but the 1980s were a golden era for parallel architectural evolution, with many people writing parallel programs, so extensions again diverged, and programming needs grew. KAI took on the challenge of rethinking things and defining parallel profiling and correctness-checking tools at the same time, with the goal of innovative software development products. By the mid-1990s we had made a lot of progress and had discussed it a bit with some hardware vendors. When SGI bought Cray in April 1996, they had an immediate directive problem (two distinct extensions) and approached us about working with them. Together we refined what we had, opened up to the industry, and formed the Architecture Review Board (ARB). OpenMP was born 18 months later, as the New York Times reported:

NEW STANDARD FOR PARALLEL PROCESSING WORKSTATIONS

Compaq, Digital, Intel, IBM and Silicon Graphics have agreed to support OpenMP, a new standard developed by Silicon Graphics and Kuck & Associates to allow programmers to write a single version of their software that will run on parallel processor computers using Unix or Windows NT operating systems. The new standard will hasten the trend in which scientists and engineers choose high-end workstations rather than supercomputers for complex computational applications. (NYT 28 Oct. 1997)

OpenMP has been adopted by many software developers in the past decade, but it has competed with traditional hand threading at the one extreme and MPI at the other. These alternatives are much lower-level expressions of parallelism: threading allows more control, MPI more scalability. Both usually require much more initial effort to think through the details of program control, data decomposition, and expressing thoughts with assembly-language-style calls. The multicore revolution now demands simple parallel application development, which OpenMP provides with language extensions and tools. While OpenMP has limitations rooted in its technical origins, the ARB continues to drive the standard forward.

The supercomputing needs of the New York Times article have now been largely replaced by scalable clusters of commodity multicore processors. What was a workstation is now a desktop or laptop multicore system. The need for effective parallel software development continues to grow in importance.

This book provides an excellent introduction to parallel programming and OpenMP. It covers the language, the performance of OpenMP programs (with one hundred pages of details about Fortran and C), common sources of errors, scalability via nested parallelism and combined OpenMP/MPI programs, OpenMP implementation issues, and future ideas. Few books cover the topics in this much detail; it includes the new OpenMP 2.5 specification, as well as hints about OpenMP 3.0 discussions and issues.

The book should be welcomed by academia, where there is rising interest in undergraduate parallel programming courses. Today, parallel programming is taught in most universities, but only as a graduate course. With multicore processors now used everywhere, introductory courses need to add parallel programming. Because performance is little discussed in any undergraduate programming courses today, parallel programming for performance is hard to incorporate. OpenMP helps to bridge this gap because it can be added simply to sequential programs and comes with multiple scheduling algorithms that can easily provide an experimental approach to parallel performance tuning.

OpenMP has some deceptive simplicities, both good and bad. It is easy to start using, placing substantial burden on the system implementers. In that sense, it puts off some experienced users and beginners with preconceived ideas about POSIX or WinThreads, who decide that parallel programming can’t be that simple and who want to indicate on which processor each thread is going to run (and other unnecessary details). OpenMP also allows for very strong correctness checking versus the correctness of the sequential program to which OpenMP directives are added. Intel Thread Checker and other tools can dynamically pinpoint, to the line number, most OpenMP parallel programming bugs. Thus, OpenMP implementations indeed remove annoying burdens from developers. This book will help educate the community about such benefits.

On the other hand, the simplicity of getting started with OpenMP can lead one to believe that any sequential program can be made into a high-performance parallel program, which is not true. Architectural and program constraints must be considered in scaling up any parallel program. MPI forces one to think about this immediately and in that sense is less seductive than OpenMP. However, OpenMP scalability is being extended with nested parallelism and by Intel’s ClusterOpenMP with new directives to distinguish shared- and distributed-memory variables. In the end, a high-performance OpenMP or OpenMP/MPI program may need a lot of work, but getting started with OpenMP remains quite easy, and this book treats the intricacies of scaling via nesting and hybrid OpenMP/MPI.

OpenMP is supported by thriving organizations. The ARB membership now includes most of the world’s leading computer manufacturers and software providers. The ARB is a technical body that works to define new features and fix problems. Any interested programmer can join cOMPunity, a forum of academic and industrial researchers and developers who help drive the standard forward.

I am pleased that the authors asked me to write this foreword, and I hope that readers learn to use the full expressibility and power of OpenMP. This book should provide an excellent introduction to beginners, and the performance section should help those with some experience who want to push OpenMP to its limits.

David J. Kuck
Intel Fellow, Software and Solutions Group
Director, Parallel and Distributed Solutions
Intel Corporation
Urbana, IL, USA
March 14, 2007

Preface

At Supercomputing 1997, a major conference on High Performance Computing, Networking, and Storage held in San Jose, California, a group of High Performance Computing experts from industry and research laboratories used an informal “Birds of a Feather” session to unveil a new, portable programming interface for shared-memory parallel computers. They called it OpenMP. The proposers included representatives from several hardware companies and from the software house Kuck and Associates, as well as scientists from the Department of Energy who wanted a way to write programs that could exploit the parallelism in shared memory machines provided by several major hardware manufacturers.

This initiative could not have been more timely. A diversity of programming models for those early shared-memory systems were in use. They were all different enough to inhibit an easy port between them. It was good to end this undesirable situation and propose a unified model.

A company was set up to own and maintain the new informal standard. It was named the OpenMP Architecture Review Board (ARB). Since that time, the number of vendors involved in the specification and maintenance of OpenMP has steadily grown. There has been increased involvement of application developers, compiler experts, and language specialists in the ARB too.

The original proposal provided directives, a user-level library, and several environment variables that could be used to turn Fortran 77 programs into shared-memory parallel programs with minimal effort. Fairly soon after the first release, the specification was further developed to enable its use with C/C++ programs and to take features of Fortran 90 more fully into account. Since then, the bindings for Fortran and C/C++ have been merged, both for simplicity and to ensure that they are as similar as possible. Over time, support for OpenMP has been added to more and more compilers. So we can safely say that today OpenMP provides a compact, yet flexible shared-memory programming model for Fortran, C, and C++ that is widely available to application developers.

Many people collaborated in order to produce the first specification of OpenMP. Since that time, many more have worked hard in the committees set up by the ARB to clarify certain features of the language, to consider extensions, and to make their implementations more compatible with each other. Proposals for a standard means to support interactions between implementations and external tools have been intensively debated. Ideas for new features have been implemented in research prototypes. Other people have put considerable effort into promoting the use of OpenMP and in teaching novices and experts alike how to utilize its features to solve a variety of programming needs. One of the authors founded a not-for-

(The remaining 377 pages are not shown in this preview.)

