Optimizing Acceleration Techniques for Two-Dimensional Page Table Walks in Virtualized Systems

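This paper examines methods for accelerating two-dimensional (2D) page table walks in virtualized systems, focusing on page table virtualization as part of CPU virtualization. The authors, from the Computing Solutions Group and the Advanced Architecture & Technology Lab at AMD, analyze how nested paging reduces the software memory management overhead imposed by virtualization, and they show that a 2D page walk reduces the need for hypervisor intervention. However, the extra dimension also increases the maximum number of architecturally required page table references.

In a virtualized environment, CPU virtualization is one of the key technologies: it allows multiple independent operating system instances to run on one physical machine, each with its own virtual address space. This requires page table virtualization, so that each virtual machine (VM) has its own page tables for mapping virtual addresses to physical addresses. Nested paging is a hardware-supported solution that extends the traditional one-dimensional page walk into a 2D page table structure. As a result, the hypervisor does not have to intervene in every page table management operation, which improves performance.

Although the 2D page walk reduces hypervisor intervention, it introduces a new challenge: the added page table dimension can require more page table entry lookups, which increases the cost of each page walk. To mitigate this, the paper analyzes the performance bottlenecks of the 2D page walk in detail and proposes optimizations. One of them is to use the page walk cache of the AMD Opteron processor. This cache stores recently used page table entries, which reduces the number of accesses to main memory and speeds up page walks. Used effectively, this hardware feature can significantly reduce the latency of a 2D page walk.

The paper may also discuss other optimization techniques, such as prefetching strategies, coordination among multiple cache levels, and smarter page table management algorithms. The goal of these techniques is to maintain or improve virtualization performance while reducing additional hardware resource consumption.

The paper is valuable for understanding memory management optimization in virtualized environments, especially for researchers and system designers who want to improve virtual machine performance and reduce resource consumption. Studying and applying the optimizations it proposes can reduce the performance loss caused by 2D page walks and further advance virtualization technology.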

Accelerating Two-Dimensional Page Walks for Virtualized Systems

Ravi Bhargava

Computing Solutions Group

Advanced Micro Devices

Austin, TX

ravi.bhargava@amd.com

Benjamin Serebrin

Computing Solutions Group

Advanced Micro Devices

Sunnyvale, CA

benjamin.serebrin@amd.com

Francesco Spadini

Computing Solutions Group

Advanced Micro Devices

Austin, TX

francesco.spadini@amd.com

Srilatha Manne

Advanced Architecture & Technology Lab

Advanced Micro Devices

Bellevue, WA

srilatha.manne@amd.com

Abstract

Nested paging is a hardware solution for alleviating the software memory management overhead imposed by system virtualization. Nested paging complements existing page walk hardware to form a two-dimensional (2D) page walk, which reduces the need for hypervisor intervention in guest page table management. However, the extra dimension also increases the maximum number of architecturally-required page table references.

This paper presents an in-depth examination of the 2D page table walk overhead and options for decreasing it. These options include using the AMD Opteron processor’s page walk cache to exploit the strong reuse of page entry references. For a mix of server and SPEC benchmarks, the presented results show a 15%-38% improvement in guest performance by extending the existing page walk cache to also store the nested dimension of the 2D page walk. Caching nested page table translations and skipping multiple page entry references produce an additional 3%-7% improvement. Much of the remaining 2D page walk overhead is due to low-locality nested page entry references, which result in additional memory hierarchy misses. By using large pages, the hypervisor can eliminate many of these long-latency accesses and further improve the guest performance by 3%-22%.

Categories and Subject Descriptors C.0 [General]: Modeling of computer architecture; C.4 [Performance of Systems]: Design studies; D.4.2 [Operating Systems]: Virtual Memory

General Terms Performance, Design, Measurement, Experimentation

Keywords Virtualization, TLB, Memory Management, Nested Paging, Page Walk Caching, Hypervisor, Virtual Machine Monitor, AMD

1. Introduction

Virtualization allows multiple operating systems to run simultaneously on one physical system. These operating systems run as guests on the virtualized system and have little or no knowledge that they no longer control the physical system resources. The hypervisor is the underlying software that inserts abstractions into a virtualized system: an operating system (OS) becomes a guest OS, physical addresses become guest physical addresses, and, in general, system elements that the OS presumed were real or physical are converted into virtualized resources under control or manipulation of the hypervisor [14, 16].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ASPLOS’08 March 1–5, 2008, Seattle, Washington, USA

2008 ACM 978-1-59593-958-6/08/03...$5.00

Ideally, a virtualized guest system will have comparable performance to an equivalent native, non-virtualized system. This can indeed be the case for compute-intensive applications. For example, the performance overhead for a virtualized system running SPECint 2000 benchmarks can be less than 5% because the hypervisor is infrequently invoked [2]. However, as the number of operations requiring hypervisor intervention increases, performance can degrade substantially. While tolerable in many server consolidation environments, these longer run times are unsatisfactory for performance-sensitive applications.

Operations intercepted by the hypervisor in a virtualized system could consume thousands of cycles of overhead to trap the condition, exit the guest, emulate the operation in the hypervisor, and return to the guest. These costs lead Adams and Agesen to state that “reducing the frequency of exits is the most important optimization for classical [hypervisors]” [2]. More specifically, one of the primary sources of virtualization exits is software memory translation management, which is required to maintain the guest page tables.

AMD has implemented nested paging to greatly reduce the overhead of hypervisor intervention in memory management [4]. Under nested paging, the guest controls its unmodified page tables. However, what the guest considers to be real, or system, physical addresses are in fact virtualized by the hypervisor. Each guest physical address in the guest page table is looked up in the nested page tables by hardware to obtain the system physical address. The end result is a two-dimensional (2D) page walk that translates the guest virtual address directly to the system physical address.
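The two translation stages described above can be illustrated with a deliberately simplified sketch. It assumes flat, single-level tables keyed by page number and a 4 KB page size; the table contents and the names `guest_page_table`, `nested_page_table`, and `translate` are hypothetical and chosen only for illustration. Real page tables are multi-level, and the lookup is performed by hardware, not software.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages

# Hypothetical single-level tables, keyed by page number.
guest_page_table = {0x10: 0x2A}   # guest virtual page  -> guest physical page
nested_page_table = {0x2A: 0x7F}  # guest physical page -> system physical page


def translate(guest_virtual_address: int) -> int:
    """Translate a guest virtual address to a system physical address."""
    page, offset = divmod(guest_virtual_address, PAGE_SIZE)
    # Stage 1: the guest's own table maps the guest virtual page
    # to a guest physical page.
    guest_physical_page = guest_page_table[page]
    # Stage 2: the nested table maps that guest physical page
    # to a system physical page.
    system_physical_page = nested_page_table[guest_physical_page]
    return system_physical_page * PAGE_SIZE + offset
```

With these example tables, `translate(0x10123)` yields `0x7F123`: the page offset is preserved while the page number passes through both mappings.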

Although nested paging removes the overhead of hypervisor intervention, it increases the maximum number of page entry references architecturally required to generate a system physical address. If a guest page walk has n levels and a nested page walk has m levels, a 2D walk requires nm + n + m page entry references. For example, a 2D page walk with four-level guest paging and four-level nested paging has six times as many page entry references as a four-level native page walk (24 versus 4). Therefore, the overall performance of a virtualized system is improved by nested paging when the eliminated hypervisor memory management overhead is greater than the new 2D page walk overhead.
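The nm + n + m count can be checked by listing the references of a 2D walk in order. The sketch below is a counting model only, not a description of any hardware implementation. It assumes that every guest page table address, including the guest's page table base, is a guest physical address that needs its own m-level nested walk before the guest entry can be read, and that the final guest physical address needs one more nested walk. The function name `two_d_walk_references` is hypothetical.

```python
def two_d_walk_references(n: int, m: int) -> list[str]:
    """List the page entry references of a 2D page walk, in order.

    n is the number of guest page table levels and m is the number of
    nested page table levels. No caching or skipping is modeled.
    """
    refs = []
    for g in range(n, 0, -1):
        # The address of the guest level-g table is a guest physical
        # address, so it costs a full nested walk of m references.
        for h in range(m, 0, -1):
            refs.append(f"nested L{h} (locating guest L{g} table)")
        # Only then can the guest entry itself be read.
        refs.append(f"guest L{g}")
    # The guest walk ends with a guest physical address for the data,
    # which needs one last nested walk.
    for h in range(m, 0, -1):
        refs.append(f"nested L{h} (final guest physical address)")
    return refs
```

Each of the n guest levels contributes m + 1 references and the final nested walk contributes m more, so the list has n(m + 1) + m = nm + n + m entries. For n = m = 4 that is 24 entries, matching the sixfold increase over a four-level native walk.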

Translation look-aside buffers (TLBs) can limit the nested paging overhead by caching the full 2D translation and reducing the frequency of page walks. For applications with a high TLB hit ratio, the additional 2D latency will have a negligible impact. However,
