A Locality-Improving Dynamic Memory Allocator
Yi Feng and Emery D. Berger
Department of Computer Science
University of Massachusetts Amherst
140 Governors Drive
Amherst, MA 01003
{yifeng, emery}@cs.umass.edu
ABSTRACT
Because most application data is dynamically allocated, the memory manager plays a crucial role in application performance by determining the spatial locality of heap objects. Previous general-purpose allocators have focused on reducing fragmentation, while most locality-improving allocators have either focused on improving the locality of the allocator (not the application) or required information supplied by the programmer or obtained by profiling. We present a high-performance memory allocator that builds on previous allocator designs to achieve low fragmentation while transparently improving application locality. Our allocator, called Vam, improves page-level locality by managing the heap in page-sized chunks and aggressively giving up free pages to the virtual memory manager. By eliminating object headers, using fine-grained size classes, and allocating objects with a reap-based algorithm, Vam improves cache-level locality. Over a range of large-footprint benchmarks, Vam improves application performance by an average of 4%–8% versus the Lea (Linux) and FreeBSD allocators. When memory is scarce, Vam improves application performance by up to 2X compared to the FreeBSD allocator, and by over 10X compared to the Lea allocator. We show that synergy between Vam's layout algorithms and the Linux swap clustering algorithm increases its swap prefetchability, further improving its performance when paging.
1. Introduction
Explicit memory managers have traditionally focused on addressing the problem of fragmentation: the wasted space that results from discontiguous free chunks of memory. Reducing fragmentation improves space efficiency and has understandably received considerable attention from memory manager designers. For example, the widely-used Lea allocator that forms the basis of the Linux malloc (DLmalloc) was designed specifically for high performance and low fragmentation [15, 16, 19].
This material is based upon work supported by the National Science Foundation under CAREER Award CNS-0347339. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
Submitted to MSP 2005 Chicago, IL USA
Copyright 2005 ACM 0-12345-67-8/90/01 ..$5.00
However, the widely-acknowledged increasing latency gap between the CPU and the various levels of the memory hierarchy (caches, RAM, and disk) makes improving data locality a first-level concern. For most applications, this means improving the locality of the heap. While applications typically exhibit temporal locality, spatial locality is dictated by the memory allocator, which determines where and how to lay out the application's dynamic data. This allocator-controlled locality can have a significant impact on the application's overall performance.
In this paper, we present a new general-purpose memory allocator called Vam that improves data locality while providing low fragmentation. Vam improves page-level locality by managing the heap in page-sized chunks and aggressively giving up free pages to the virtual memory manager. By eliminating object headers, using a judicious selection of size classes, and allocating objects with a reap-based algorithm [9], Vam improves cache-level locality.
We compare Vam to the low-fragmentation Linux allocator (DLmalloc) and to the page-level locality-improving FreeBSD allocator (PHKmalloc) [17], both of which we describe in detail. To our knowledge, PHKmalloc has not been discussed previously in the memory management literature. We build on these algorithms, incorporating their best features while removing most of their disadvantages.
Our experiments on a suite of memory-intensive benchmarks show that Vam consistently achieves the best performance. Vam performs on average 8% faster than DLmalloc and 4% faster than PHKmalloc when there is sufficient physical memory to avoid paging. When physical memory is scarce, Vam outperforms these allocators by over 10X and up to 2X, respectively. We show that part of this improvement is due to an unintended but fortunate synergy between Vam and the way Linux manages swap space, which holds evicted pages on disk. We call this phenomenon swap prefetchability and show that it leads to improved performance when paging.
2. Related Work
There has been extensive research on dynamic memory allocation. In their well-known survey paper, Wilson et al. devote most of their attention to the question of fragmentation, which they identify as the most important metric for evaluating memory allocators [24]. In subsequent studies, Johnstone and Wilson evaluate a wide range of allocation policies using actual C/C++ programs and argue that fragmentation is near zero given a good choice of allocation policy [15, 16]. While they argue that reducing fragmentation generally improves locality, we show that Vam's approach is more effective.
Most previous researchers have attacked the problem of locality in memory allocation either by improving the locality of the allocator itself or by using extra information such as programmer hints or profiles to guide placement decisions. Grunwald and Zorn