Multithreading Support and Optimization in Pin Dynamic Binary Instrumentation


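This article examines how Pin supports multithreaded applications in a dynamic binary instrumentation system, and describes Pin's approach to code cache handling and performance optimization. Pin is a dynamic analysis tool developed by Intel that lets developers inject or modify binary code at run time for a variety of analysis tasks. Although most of the literature on Pin concentrates on single-threaded applications, this paper focuses on how to support large, multithreaded applications efficiently. The authors are Kim Hazelwood, Greg Lueck, and Robert Cohn, of the University of Virginia and Intel Corporation.

The paper notes that although basic multithreading support is relatively simple to implement, building a system that scales in both memory and performance is a much harder problem. The latest version of Pin makes several key design decisions, covering the just-in-time (JIT) compiler, the emulator, and the code cache, to achieve scalable performance and memory footprint.

The JIT compiler is the core component of dynamic binary instrumentation: it translates the original machine code into code that includes the analysis routines. In a multithreaded environment, the JIT must generate and manage code for every thread efficiently while avoiding data races and synchronization problems.

Pin's emulator ensures that the modified code executes correctly in a multithreaded environment. The emulator may need to handle interactions between threads, such as memory accesses and system calls, to preserve program correctness; this typically involves synchronization mechanisms and state tracking.

Code cache management is another key point. In a multithreaded application, each thread could end up with its own copy of the code cache, so effective space management and allocation policies are needed to keep memory consumption in check. By optimizing how code cache space is allocated and reused, Pin lets multiple threads share and update code while reducing memory overhead.

The paper details how Pin uses these techniques to address the challenges of multithreading, such as inter-thread communication, resource contention, and memory management, and so delivers high performance with a low memory footprint. These design decisions are a useful reference for understanding how Pin works in multithreaded environments and for optimizing similar dynamic binary instrumentation systems.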
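The idea of threads sharing one code cache, rather than each keeping a private copy, can be illustrated with a small sketch. This is not Pin's implementation: the names `Trace`, `SharedCodeCache`, `lookup_or_compile`, and `simulate` are hypothetical, the "compile" step is a placeholder string, and a single mutex stands in for whatever synchronization a real system would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a translated trace; a real system stores machine code.
struct Trace {
    std::uint64_t guest_addr;
    std::string body;
};

// One cache shared by all threads, keyed by the guest (original) address.
class SharedCodeCache {
public:
    // Returns the cached trace for guest_addr, "compiling" it on a miss.
    // The returned reference stays valid because entries are never erased and
    // std::unordered_map keeps element references stable across rehashing.
    const Trace& lookup_or_compile(std::uint64_t guest_addr) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = traces_.find(guest_addr);
        if (it == traces_.end()) {
            ++compiles_;  // assumption: compilation happens under the lock
            Trace fresh{guest_addr, "instrumented@" + std::to_string(guest_addr)};
            it = traces_.emplace(guest_addr, fresh).first;
        }
        assert(it->second.guest_addr == guest_addr);
        return it->second;
    }

    std::size_t compile_count() const {
        std::lock_guard<std::mutex> guard(lock_);
        return compiles_;
    }

private:
    mutable std::mutex lock_;
    std::unordered_map<std::uint64_t, Trace> traces_;
    std::size_t compiles_ = 0;
};

// Runs num_threads workers that all execute the same guest addresses and
// returns how many traces were compiled in total.
std::size_t simulate(unsigned num_threads,
                     const std::vector<std::uint64_t>& guest_addrs) {
    SharedCodeCache cache;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&cache, &guest_addrs] {
            for (std::uint64_t addr : guest_addrs) {
                cache.lookup_or_compile(addr);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return cache.compile_count();
}
```

With a shared cache, the number of compilations in this sketch depends on the number of distinct guest addresses rather than on the number of threads, which is the memory-scaling property the summary describes. Holding one lock around both lookup and compilation is the simplest choice and would itself become a bottleneck at scale.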

Scalable Support for Multithreaded Applications on Dynamic Binary Instrumentation Systems

Kim Hazelwood†,‡, Greg Lueck‡, Robert Cohn‡

†University of Virginia
‡Intel Corporation

www.pintool.org

Abstract

Dynamic binary instrumentation systems are used to inject or modify arbitrary instructions in existing binary applications; several such systems have been developed over the past decade. Much of the literature describing the internal architecture and performance of these systems has focused on executing single-threaded guest applications. In this paper, we discuss the specific design decisions necessary for supporting large, multithreaded applications on JIT-based dynamic instrumentation systems. While implementing a working solution for multithreading is straightforward, providing a system that scales in terms of memory and performance is much more intricate. We highlight the design decisions in the latest version of the Pin dynamic instrumentation system, including the just-in-time compiler, the emulator, and the code cache. The overall design strives to provide scalable performance and memory footprints on modern applications.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Code generation, Optimization, Run-time environments

General Terms: Languages, Management, Measurement, Performance

Keywords: scalability, multithreading, memory management, instrumentation

1. Introduction

The recent trend toward multicore architectures has led software developers to focus on ways to leverage multiple processing cores in their application software. One way to utilize multiple cores is to develop multithreaded (MT) applications. Despite the fact that MT programs are ubiquitous, many system designers still evaluate their systems with small, single-threaded (ST) applications. There are many factors contributing to the lack of analysis of systems with MT workloads. Simulation and analysis tools either emphasize or exclusively support single-threaded applications or are too slow to execute large MT programs. MT applications are inherently less deterministic than ST applications, complicating the evaluation methodology (Pereira et al. 2008). The disconnect between today’s architectures and the applications supported by today’s tools is particularly problematic as we move further away from single-core machines, and the results acquired from single-threaded applications become even more irrelevant.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ISMM’09, June 19-20, 2009, Dublin, Ireland.
© 2009 ACM 978-1-60558-347-1/09/06...$5.00

To remedy this disconnect, many simulation and analysis tools are adding support for multithreaded applications. Developers of these tools are learning that providing support for multithreaded applications is often straightforward, but providing robust and/or scalable support for multithreading tends to be much more of a challenge (Jaleel et al. 2008). Given recent trends toward dramatically increasing the number of cores in multicore processors, it is ever more critical to develop scalable solutions to this and many other design challenges.

At the same time, dynamic binary instrumentation has emerged as an invaluable mechanism for analyzing and modifying software, and even simulating new and existing hardware. Unlike static instrumentation systems, dynamic instrumentation systems enable analysis of all executed instructions including shared libraries, dynamically generated code, and perhaps most importantly, applications for which source code is not available. One such dynamic binary instrumentation system that is widely used due to its user-friendly API and robust implementation is the Pin dynamic instrumentation system.

In this paper, we present and analyze various aspects of our design for robust support for multithreaded guest application execution on the Pin dynamic instrumentation system (Luk et al. 2005). After providing an overview of Pin in Section 2, we introduce the basic modifications necessary for supporting multithreaded applications in Section 3. Next, we delve into our approaches for supporting signals in Section 4. Section 5 focuses on the code cache and presents our trace construction policy that balances memory and performance overheads, and our generational cache flushing policy that allows us to avoid synchronizing cache flushes across all threads. Section 6 then evaluates the resulting memory and performance scalability of the system. Finally, Section 7 presents related work and Section 8 concludes.

2. Pin’s High-Level Architecture

Before delving into the design aspects that target Pin’s support for multithreading, we first provide a high-level view of Pin’s internal architecture. We highlight the features that are discussed in more detail when we focus on multithreading support.

At a very high level, Pin is a tool that allows users to modify existing binary applications with an easy-to-use, cross-platform instrumentation API. The user simply writes a short, plug-in C++ program (called a Pintool) that defines where the new code should be inserted, what code to insert, and when to notify the user of various events such as thread creation (i.e., callbacks). The rest is handled automatically and transparently by Pin, which operates on IA32, Intel® 64, Intel® Itanium, and ARM. Pin operates at run-time, since it is impossible to find and modify all of the instruc-
