Background Optimization Techniques in Full System Binary Translation


Updated 2024-09-03 · PDF, 212 KB


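"Background Optimization in Full System Binary Translation" is an academic paper on background optimization in full system binary translation, written by Roman A. Sokolov (MCST CJSC) and Alexander V. Ermolovich (Intel CJSC). It addresses compatibility between processors with different instruction set architectures and uses dynamic optimization to improve system performance.

Binary translation provides binary-code compatibility between processor architectures: legacy software can run on a new architecture without being recompiled. The difficulty is that dynamic optimization, while essential to performance, can itself incur significant overhead in both CPU cycles and whole-system latency, particularly when optimization time counts toward the application's execution time.

To address this, the paper proposes background optimization: optimization runs in a separate thread, concurrently with execution. Implementing it requires an infrastructure for multithreaded execution inside the binary translation system. Because the optimizer runs in its own thread, it stays off the critical path of the application under translation, which reduces the latency and CPU cost attributable to optimization.

Implementation details of background optimization may include:

1. **Multithreaded architecture**: introduce multithreading into the binary translation system so that the execution thread and the optimization thread can work in parallel.
2. **Optimization scheduling**: design a scheduling policy that lets the optimization thread make progress without disturbing application execution.
3. **Synchronization and communication**: establish synchronization and communication between the execution and optimization threads to prevent data inconsistency and race conditions.
4. **Performance monitoring and tuning**: continuously monitor the effect of background optimization and adjust the optimization strategy according to performance feedback.
5. **Optimization selection**: choose the most suitable optimizations, such as loop unrolling or instruction fusion, for the type of code and the usage scenario.

With background optimization, full system binary translation keeps its compatibility benefits while improving performance and lowering run-time overhead. The technique is of practical value for software that must run on multiple hardware platforms, especially legacy applications that cannot be recompiled or are difficult to recompile.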


Background Optimization in Full System Binary Translation

Roman A. Sokolov

MCST CJSC

Moscow, Russia

Email: roman.a.sokolov@gmail.com

Alexander V. Ermolovich

Intel CJSC

Moscow, Russia

Email: karbo@pvk13.org

Abstract: Binary translation and dynamic optimization are widely used to provide compatibility between legacy and promising upcoming architectures at the level of executable binary code. Dynamic optimization is one of the key contributors to the performance of a dynamic binary translation system. At the same time it can be a major source of overhead, both in CPU cycles and in whole-system latency, as long as optimization time is included in the execution time of the application under translation. One way to eliminate this overhead is to perform optimization simultaneously with execution, in a separate thread. In this paper we present an implementation of this technique in a full system dynamic binary translator. For this purpose, an infrastructure for multithreaded execution was implemented in the binary translation system. This allows dynamic optimization to run in a separate thread, independently of and concurrently with the main thread that executes the binary code under translation. Depending on the computational resources available, this is achieved either by interleaving the two threads on a single processor core or by moving the optimization thread to an underutilized processor core. In the first case, the latency introduced into the system by computationally intensive dynamic optimization is reduced. In the second case, overlapping the execution and optimization threads also eliminates optimization time from the total execution time of the original binary code.

I. INTRODUCTION

Binary translation and dynamic optimization technologies are widely used in modern software and hardware computing systems [1]. In particular, dynamic binary translation systems (DBTS), which combine the two, provide compatibility between widely used legacy architectures and promising upcoming ones at the level of executable binary code. In the context of binary translation these architectures are usually referred to as source and target, respectively.

A DBTS executes binary code of the source architecture on top of target architecture hardware whose instruction set architecture (ISA) is incompatible with it. It translates executable code incrementally (as opposed to static compilation of the whole application), interleaving translation with execution of the generated translated code. One of the key requirements every DBTS has to meet is that the performance of source code executed through binary translation be comparable to, or even exceed, the performance of native execution (that is, execution on source architecture hardware).
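The incremental scheme described above can be illustrated with a toy dispatch loop. This is only a sketch under stated assumptions, not the translator from the paper: the "source program" `SOURCE_BLOCKS`, the class `ToyTranslator`, and its methods are hypothetical names, and "translation" merely wraps each block in a Python callable. What it shows is that a block is translated the first time control reaches it and is served from a translation cache afterwards.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical toy "source program": block address -> (value added to the
# accumulator, address of the next block, or None to halt).
SOURCE_BLOCKS: Dict[int, Tuple[int, Optional[int]]] = {
    0: (1, 1),
    1: (2, 2),
    2: (3, None),
}

TranslatedBlock = Callable[[int], Tuple[int, Optional[int]]]


class ToyTranslator:
    """Translates blocks on first use and caches the result (illustrative only)."""

    def __init__(self, blocks: Dict[int, Tuple[int, Optional[int]]]):
        self.blocks = blocks
        self.cache: Dict[int, TranslatedBlock] = {}
        self.translations = 0  # number of blocks translated so far

    def translate(self, addr: int) -> TranslatedBlock:
        """Stand-in for code generation: returns a callable for one block."""
        self.translations += 1
        delta, next_addr = self.blocks[addr]

        def translated(acc: int) -> Tuple[int, Optional[int]]:
            return acc + delta, next_addr

        return translated

    def run(self, entry: int, acc: int = 0) -> int:
        """Dispatch loop: translation is interleaved with execution."""
        addr: Optional[int] = entry
        while addr is not None:
            code = self.cache.get(addr)
            if code is None:       # block not seen before: translate it now
                code = self.translate(addr)
                self.cache[addr] = code
            acc, addr = code(acc)  # execute the translated block
        return acc
```

Running the same entry point twice translates each block only once; the second run executes entirely from the cache.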

An optimizing translator is usually employed to achieve higher DBTS performance. It makes it possible to generate highly efficient target architecture code that fully utilizes the architectural features introduced to support binary translation. In addition, dynamic optimization can benefit from actual information about the behavior of executables, which static compilers usually do not possess.

At the same time, dynamic optimization can incur significant overhead as long as optimization time is included in the execution time of the application under translation. Total optimization time can be significant, and it will not necessarily be compensated by the speed-up of the translated code if the application's run time is too short.
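The trade-off in the preceding paragraph can be made concrete with a small break-even calculation. The model below is an assumption made for illustration, not a formula from the paper: a code region costs `time_before` per execution before optimization and `time_after` afterwards, and optimizing it costs `opt_time` once, all in the same integer time unit. Optimization pays off only if the region is executed often enough afterwards.

```python
from typing import Optional


def optimization_pays_off(opt_time: int, time_before: int,
                          time_after: int, remaining_runs: int) -> bool:
    """True if the time saved over the remaining runs exceeds the one-off cost."""
    saved = (time_before - time_after) * remaining_runs
    return saved > opt_time


def break_even_runs(opt_time: int, time_before: int,
                    time_after: int) -> Optional[int]:
    """Smallest number of later runs for which optimization pays off.

    Returns None if the optimized code is not faster at all.
    """
    gain = time_before - time_after
    if gain <= 0:
        return None
    # Smallest integer n with n * gain > opt_time.
    return opt_time // gain + 1
```

For example, with hypothetical costs `opt_time=5000`, `time_before=10`, and `time_after=6`, the region must run at least 1251 more times before the optimization is repaid; a short-lived application may never reach that point.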

Also, the operation of an optimizing translator can worsen the latency (i.e., increase the pause time) of an interactive application or operating system under translation. Latency here means the time a computer system takes to respond to external events such as asynchronous hardware interrupts from attached I/O devices and interfaces. This characteristic is as important as overall performance to the end user, to the operation of attached hardware, and to other computers across the network.
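The effect on latency can be sketched with a deliberately simplified timing model. All of it is assumed for illustration: time is counted in abstract integer units, the optimizer occupies the core from time 0 for `opt_work` units, an interrupt can only be serviced when the optimizer gives up the core, and servicing takes no time. If the optimizer runs to completion, an interrupt may wait for the rest of the optimization; if it is interleaved with execution and yields every `quantum` units, the wait is bounded by the quantum. The function name `interrupt_delays` is hypothetical.

```python
from typing import List, Optional


def interrupt_delays(opt_work: int, arrivals: List[int],
                     quantum: Optional[int] = None) -> List[int]:
    """Waiting time of each interrupt while the optimizer holds the core.

    quantum=None models an optimizer that runs to completion; otherwise
    the optimizer yields the core every `quantum` time units.
    """
    if quantum is not None and quantum <= 0:
        raise ValueError("quantum must be positive")
    slice_len = opt_work if quantum is None else quantum
    delays = []
    for t in arrivals:
        if t >= opt_work:  # optimizer already finished: no waiting
            delays.append(0)
            continue
        # Next moment the optimizer yields: end of its current slice,
        # or the end of optimization, whichever comes first.
        next_yield = min(opt_work, (t // slice_len + 1) * slice_len)
        delays.append(next_yield - t)
    return delays
```

With `opt_work=100` and interrupts arriving at times 10, 55, and 120, the run-to-completion optimizer gives delays of 90, 45, and 0, while a quantum of 20 gives 10, 5, and 0.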

Full system dynamic binary translators have to provide low latency of operation as well. Binary translation systems of this class aim to implement all the semantics and the behavior model of the source architecture and to execute the entire hierarchy of system-level and application-level software, including the BIOS and operating systems. They exclusively control all the computer system hardware and its operation. Throughout this paper we will also refer to this type of binary translation system as a virtual machine level (or VM-level) binary translator (as opposed to an application-level binary translator).

One recognized technique for reducing dynamic optimization overhead is to perform optimization simultaneously (concurrently) with the execution of the original binary code, utilizing unemployed computational resources or free cycles. It has been used in a number of dynamic binary translation and optimization systems [2], [3], [4], [5], [6], [7], [8]. We will refer to this method as background optimization (as opposed to consequent optimization, in which optimizing translation interrupts execution and uses processor time exclusively until it completes).
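The structure of background optimization can be sketched with a worker thread and a request queue. This is a minimal sketch under stated assumptions, not the paper's implementation: `BackgroundOptimizer`, `toy_optimize`, `slow_sum`, and `fast_sum` are hypothetical names, and the "optimizer" simply swaps a loop for a closed-form equivalent. The execution thread keeps running the current version of a region while the worker prepares a better one and installs it under a lock.

```python
import queue
import threading
from typing import Callable, Dict


class BackgroundOptimizer:
    """Optimizes code regions in a separate thread (illustrative only)."""

    def __init__(self, optimize: Callable[[Callable], Callable]):
        self._optimize = optimize
        self._requests: "queue.Queue[str]" = queue.Queue()
        self._lock = threading.Lock()
        self._code: Dict[str, Callable] = {}  # region -> current version
        worker = threading.Thread(target=self._optimize_loop, daemon=True)
        worker.start()

    def install(self, region: str, code: Callable) -> None:
        with self._lock:
            self._code[region] = code

    def current(self, region: str) -> Callable:
        with self._lock:
            return self._code[region]

    def execute(self, region: str, arg: int) -> int:
        """Run whichever version of the region is installed right now."""
        return self.current(region)(arg)

    def request_optimization(self, region: str) -> None:
        """Queue a region for optimization and return immediately."""
        self._requests.put(region)

    def wait_idle(self) -> None:
        """Block until every queued request has been processed."""
        self._requests.join()

    def _optimize_loop(self) -> None:
        while True:
            region = self._requests.get()
            better = self._optimize(self.current(region))  # done off the main thread
            self.install(region, better)                   # swap in the new version
            self._requests.task_done()


def slow_sum(n: int) -> int:  # stand-in for unoptimized translated code
    total = 0
    for i in range(n):
        total += i
    return total


def fast_sum(n: int) -> int:  # stand-in for optimized code (valid for n >= 0)
    return n * (n - 1) // 2


def toy_optimize(code: Callable) -> Callable:
    return fast_sum if code is slow_sum else code
```

Calling `request_optimization` does not block the caller, which is the contrast with consequent optimization. Note that in standard CPython builds the global interpreter lock serializes bytecode execution, so this sketch shows the structure of the technique rather than a real overlap of CPU-bound work.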

This paper describes the implementation of background optimization in a VM-level dynamic binary translation system. This is achieved by separating optimizing translation from the execution flow into an independent thread which can then con-
