Figure 1: KNN’s data processing speed.
2.2.2 Key Observations
We evaluate 11 computing resource configurations, as
shown by the horizontal axis in Figure 1. Each configu-
ration is denoted as gG-cC-tT, where g represents the
number of simultaneously running GPU map tasks, c the
number of simultaneously running CPU map tasks, and t
the number of threads inside each CPU map task. Figure 1
shows the performance, with the vertical axis representing
the data processing speed. We make two significant obser-
vations from the results:
• Using two GPUs brings only a slight performance gain
over one GPU. When only one GPU is used, the data
processing speed is 60MB/s (the first bar in Figure 1).
However, when a second GPU is exploited simultane-
ously, the data processing speed increases only slightly,
to 65MB/s (the second bar in Figure 1).
• Coordinating CPUs together with one GPU leads to
worse performance than using one GPU alone. When
only one GPU is used, the data processing speed is
60MB/s (the first bar). However, when one or more
CPU tasks run simultaneously with one GPU task
(bars 3-7), the overall performance unexpectedly de-
creases, varying from 58MB/s down to 51MB/s. A
similar observation holds for the configurations con-
taining two GPU tasks.
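The arithmetic behind these two observations can be made concrete with a short sketch. The speeds are taken from Figure 1; the relative-gain computation itself is our illustration, not code from Hadoop+:

```java
// Quantifies the two observations using the speeds reported in
// Figure 1 (MB/s). The configuration labels follow the gG-cC-tT
// notation from Section 2.2.2.
public class ObservationCheck {
    public static void main(String[] args) {
        double oneGpu = 60.0;          // 1G-0C-0T, first bar
        double twoGpu = 65.0;          // 2G-0C-0T, second bar
        double gpuPlusCpuWorst = 51.0; // worst of bars 3-7

        // Marginal gain of the second GPU: (65-60)/60, roughly 8%.
        double secondGpuGain = (twoGpu - oneGpu) / oneGpu;
        // Slowdown from coordinating CPU tasks with one GPU: (60-51)/60 = 15%.
        double cpuCoordLoss = (oneGpu - gpuPlusCpuWorst) / oneGpu;

        System.out.printf("second GPU gain: %.1f%%%n", secondGpuGain * 100);
        System.out.printf("worst CPU-coordination loss: %.1f%%%n", cpuCoordLoss * 100);
    }
}
```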
2.2.3 Analysis
First we demonstrate the different behaviors of a CPU
task and a GPU task in Hadoop+, as shown in Figure 2. As
the red line shows, the I/O traffic of a CPU task remains
almost unchanged during task execution. The reason is that
Hadoop+ leverages the execution mechanism in Hadoop for
CPU tasks, which iteratively reads a small piece of data
and processes it quickly, so the I/O traffic stays low.
However, the behavior of the GPU task is different, as
shown by the blue line. To obtain high GPU occupancy, the
GPU task reads a chunk of data, transfers it to the GPU,
and launches the GPU kernel to process it; it thus exhibits
distinct phase behavior. In particular, the I/O traffic is high
when the GPU task is reading data from HDFS (via its host
thread), and low when the GPU task is executing the kernel.
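The contrast between the two I/O patterns can be sketched as a toy timeline. The shapes follow Figure 2, but all rates and phase lengths below are illustrative assumptions, not measurements:

```java
// Toy model of the I/O patterns in Figure 2 (numbers are assumed):
// a GPU task alternates a read burst (filling a chunk for the GPU)
// with a near-zero-I/O kernel phase; a CPU task streams small
// records at a steady low rate.
public class PhaseSketch {
    // GPU task: high traffic during the first 2s of each 5s cycle
    // (host thread reading from HDFS), ~0 while the kernel runs.
    static double gpuTraffic(int t) {
        return (t % 5) < 2 ? 70.0 : 0.0;
    }

    // CPU task: iterative small reads, roughly constant traffic.
    static double cpuTraffic(int t) {
        return 8.0;
    }

    public static void main(String[] args) {
        for (int t = 0; t < 10; t++)
            System.out.printf("t=%ds  gpu=%.0f MB/s  cpu=%.0f MB/s%n",
                              t, gpuTraffic(t), cpuTraffic(t));
    }
}
```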
To analyze the reason for the observations in Section 2.2.2,
we take one GPU task, denoted x, and examine its perfor-
mance under the 11 configurations. We find that the key
reason is contention for the shared I/O resources among
CPU and GPU tasks. To demonstrate this, we comment out
the computation in x and run it under the 11 configurations.
Figure 2: Behaviors of CPU/GPU tasks.
Figure 3: KNN's data reading speed.
In this way, each GPU task reads a split from HDFS without
any following computations. Figure 3 shows the data read-
ing speed of x. When only x is running, the data reading
speed can reach 72MB/s (the first bar 1G-0C-0T), while it
drops to 36MB/s when another GPU task is running simul-
taneously (the second bar 2G-0C-0T). Furthermore, when 4
single-threaded CPU tasks and another GPU task run to-
gether with x, its data reading speed decreases to only
14MB/s (the last bar 2G-4C-1T).
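A simple equal-share model illustrates this contention: if the aggregate HDFS read bandwidth available to one node is B, then n concurrent readers each see roughly B/n. Taking B = 72MB/s from the 1G-0C-0T bar, this model (ours, not the paper's) reproduces the first two bars of Figure 3 exactly:

```java
// Equal-share contention model for the data-reading experiment:
// n concurrent readers each get roughly B/n of the aggregate
// HDFS read bandwidth B.
public class ContentionModel {
    static double perTaskSpeed(double aggregateMBps, int readers) {
        return aggregateMBps / readers;
    }

    public static void main(String[] args) {
        double b = 72.0; // from the 1G-0C-0T bar in Figure 3
        System.out.println(perTaskSpeed(b, 1)); // 72.0, matches 1G-0C-0T
        System.out.println(perTaskSpeed(b, 2)); // 36.0, matches 2G-0C-0T
        // 2 GPU tasks + 4 CPU tasks = 6 readers -> 12.0; the measured
        // 14MB/s is a bit higher because CPU tasks read less aggressively
        // than GPU tasks, so the split is not perfectly equal.
        System.out.println(perTaskSpeed(b, 6));
    }
}
```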
2.2.4 Summary - The Challenge
The observations and our analyses demonstrate that it is a
challenge to model the heterogeneity for MapReduce appli-
cations running in heterogeneous clusters. In particular, the
challenge can be summarized into the following questions:
• What factors would affect the performance gain when
allocating a computing resource to an application?
• How will the performance contribution of one comput-
ing resource, e.g., GPU, vary with applications?
• How to select a resource configuration for an applica-
tion for different goals, e.g., to obtain the best perfor-
mance, or to be the most cost-effective?
3. HADOOP+ FRAMEWORK
Figure 4 gives an overview of our Hadoop+ framework.
Besides the Map and Reduce primitives in Hadoop, Hadoop+
provides two additional primitives, PMap and PReduce, to
programmers. The difference is that the PMap and PReduce
in Hadoop+ enable programmers to write explicit parallel
CUDA/OpenCL functions running on GPUs as plug-ins, as
shown by the box of “User-Provided PMap/PReduce Func-
tion” in Figure 4. Meanwhile, users can also use the Map
and Reduce functions in Hadoop. In Hadoop+, users can
provide Map, PMap or both, and Reduce, PReduce or both.
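To make the plug-in idea concrete, the following is a hypothetical sketch of what a PMap plug-in could look like. The interface name, method signature, and split type are our assumptions for illustration; they are not taken from the Hadoop+ API:

```java
// Hypothetical PMap plug-in interface (names and types assumed,
// not the actual Hadoop+ API). Unlike Map's per-record (key, value)
// input, PMap receives a whole split so the host thread can ship
// one large chunk to the GPU at once.
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;

interface PMapper {
    List<Entry<String, Integer>> pmap(byte[] split);
}

// A CPU stand-in for a CUDA/OpenCL kernel: it emits the split size
// as a single (key, value) pair, standing in for the bulk work a
// real plug-in would offload to the GPU.
class CountingPMapper implements PMapper {
    public List<Entry<String, Integer>> pmap(byte[] split) {
        List<Entry<String, Integer>> out = new ArrayList<>();
        out.add(new SimpleEntry<>("bytes", split.length));
        return out;
    }
}
```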
To support explicit parallel Map functions, Hadoop+ pro-
vides different input parameters for Map and PMap. In
particular, the input of Map is (key,value), while the input