Optimizing Dynamic Languages: Object Allocation Removal by Partial Evaluation in a Tracing JIT


"Allocation Removal by Partial Evaluation in a Tracing JIT - 2010 (bolz-allocation-removal)-计算机科学" 这篇论文探讨了在追踪式即时编译器（Tracing JIT）中通过部分评估（Partial Evaluation）来消除内存分配和运行时类型检查的技术，以提升动态语言的性能。动态语言由于频繁的内存分配和类型检查，其性能往往受到限制，这使得它们在处理纯算法问题上不如静态类型语言。论文作者包括Carl Friedrich Bolza、Antonio Cunia、Maciej Fijałkowski、Michael Leuschel、Samuele Pedroni和Armin Rigo，他们分别来自德国海德堡大学的STUPS小组、merlinux GmbH和Open End公司。部分评价是一种编译优化技术，它分析程序的执行路径，提前计算出可确定的部分，从而减少运行时的开销。在追踪JIT的上下文中，部分评价可以识别并消除不必要的对象分配和类型检查，这将显著提升代码执行效率。论文中，研究人员用Python虚拟机作为实验平台，对优化技术进行了评估，并发现该方法在所有实际基准测试中都取得了良好的结果。论文分类属于编程语言的处理器领域，具体是代码生成和解释器、运行时环境。研究涉及的主要方面是语言和性能。该工作的核心目标是改进动态语言的执行效率，使其在处理各种任务时能够更接近静态类型语言的性能水平，尤其是在处理大规模数据和复杂运算的场景下。论文内容可能涵盖了以下知识点： 1. **追踪式即时编译器(Tracing JIT)**：这是一种优化技术，它能够在程序运行时动态地将热点代码转换为机器码，以提高性能。追踪JIT通过追踪程序中的循环和其他热点路径，将这些代码片段转化为高效的本地代码。 2. **部分评估(Partial Evaluation)**：这是一种编译技术，它结合了程序和特定输入，生成一个新的、针对特定任务优化过的程序。在本文的上下文中，部分评估被用来提前进行类型推断和优化，减少运行时的开销。 3. **内存分配(Allocation)**：在动态语言中，内存分配通常是频繁发生的，特别是在创建大量临时对象时。这部分操作会带来额外的性能负担，因为它们涉及到内存管理，如垃圾回收。 4. **运行时类型检查(Run-time Type Checks)**：在动态类型的语言中，每次操作都可能需要进行类型检查，以确保操作的正确性。这种检查在某些情况下可能会成为性能瓶颈。 5. **优化效果评估**：通过对比优化前后的基准测试，论文展示了部分评价和追踪JIT如何协同工作，以提高Python等动态语言的性能。 6. **应用实例与基准测试**：论文可能包含了一系列真实世界的基准测试，这些测试用于验证优化技术的有效性和适用性。 7. **动态语言的性能挑战**：论文讨论了动态语言在性能上的局限性，以及如何通过编译器优化来克服这些挑战。通过对这些概念的深入理解和实现，开发者可以设计出更高效、更适合大规模计算的动态语言实现，进一步推动动态语言在各个领域的应用。

Allocation Removal by Partial Evaluation in a Tracing JIT

Carl Friedrich Bolz

Antonio Cuni

Maciej Fijałkowski

Michael Leuschel

Samuele Pedroni

Armin Rigo

Heinrich-Heine-Universität Düsseldorf, STUPS Group, Germany

merlinux GmbH, Hildesheim, Germany

Open End, Göteborg, Sweden

cfbolz@gmx.de anto.cuni@gmail.com fijal@merlinux.eu leuschel@cs.uni-duesseldorf.de

samuele.pedroni@gmail.com arigo@tunes.org

Abstract

The performance of many dynamic language implementations suffers from high allocation rates and runtime type checks. This makes dynamic languages less applicable to purely algorithmic problems, despite their growing popularity. In this paper we present a simple compiler optimization based on online partial evaluation to remove object allocations and runtime type checks in the context of a tracing JIT. We evaluate the optimization using a Python VM and find that it gives good results for all our (real-life) benchmarks.

Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors (code generation, interpreters, run-time environments)

General Terms Languages, Performance, Experimentation

Keywords Tracing JIT, Partial Evaluation, Optimization

1. Introduction

The objective of a just-in-time (JIT) compiler for a dynamic language is to improve the speed of the language over an implementation of the language that uses interpretation. The first goal of a JIT is therefore to remove the interpretation overhead, i.e. the overhead of bytecode (or AST) dispatch and the overhead of the interpreter’s data structures, such as the operand stack. The second important problem that any JIT for a dynamic language needs to solve is how to deal with the overhead of boxing primitive types and of type dispatching. Those are problems that are usually not present or at least less severe in statically typed languages.

This research is partially supported by the BMBF funded project PyJIT (nr. 01QE0913B; Eureka Eurostars).

Boxing of primitive types is necessary because dynamic languages need to be able to handle all objects, even integers, floats, booleans etc. in the same way as user-defined instances. Thus those primitive types are usually boxed, i.e., a small heap-structure is allocated for them that contains the actual value. Boxing primitive types can be very costly, because a lot of common operations, particularly all arithmetic operations, have to produce new boxes, in addition to the actual computation they do. Because the boxes are allocated on the heap, producing many of them puts pressure on the garbage collector.
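To make the cost of boxing concrete, the following is a minimal sketch in Python, not code from the paper or from PyPy. The class name `BoxedInt`, its `allocations` counter and the function `sum_to` are hypothetical names chosen for illustration. The sketch counts how many boxes a simple summation loop creates when every arithmetic result has to be wrapped in a new heap object.

```python
class BoxedInt:
    """Hypothetical box: a small heap object holding one integer value."""

    allocations = 0  # class-wide counter, kept only for this illustration

    def __init__(self, value):
        BoxedInt.allocations += 1
        self.value = value

    def add(self, other):
        # The result of every addition is wrapped in a fresh box.
        return BoxedInt(self.value + other.value)


def sum_to(n):
    """Sum the integers 0..n-1 using boxed integers for all arithmetic."""
    total = BoxedInt(0)
    i = BoxedInt(0)
    one = BoxedInt(1)
    while i.value < n:
        total = total.add(i)  # allocates a box for the new total
        i = i.add(one)        # allocates a box for the new counter
    return total


BoxedInt.allocations = 0
result = sum_to(1000)
print(result.value, BoxedInt.allocations)
```

In this sketch each loop iteration allocates two boxes that become garbage one iteration later, which is the kind of short-lived allocation the text describes.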

Type dispatching is the process of finding the concrete implementation that is applicable to the objects at hand when performing a generic operation on them. An example would be the addition of two objects: for addition, the types of the concrete objects need to be checked and the suitable implementation chosen. Type dispatching is a very common operation in modern dynamic languages because no types are known at compile time. Therefore all operations need it.
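The addition example can be sketched as follows. This is an illustrative toy, not the dispatch logic of any real Python implementation: the wrapper classes `W_Int`, `W_Float` and `W_Str` and the function `generic_add` are assumed names, and real implementations support many more cases.

```python
class W_Int:
    def __init__(self, value):
        self.value = value


class W_Float:
    def __init__(self, value):
        self.value = value


class W_Str:
    def __init__(self, value):
        self.value = value


def generic_add(a, b):
    """Generic addition: inspect both operand types, then pick an implementation."""
    if isinstance(a, W_Int) and isinstance(b, W_Int):
        return W_Int(a.value + b.value)                   # integer addition
    numeric = (W_Int, W_Float)
    if isinstance(a, numeric) and isinstance(b, numeric):
        return W_Float(float(a.value) + float(b.value))   # mixed or float addition
    if isinstance(a, W_Str) and isinstance(b, W_Str):
        return W_Str(a.value + b.value)                   # string concatenation
    raise TypeError("unsupported operand types for add")


print(generic_add(W_Int(2), W_Int(3)).value)
print(generic_add(W_Int(1), W_Float(0.5)).value)
```

Every call pays for the type checks before any arithmetic happens, and every result is again a freshly allocated box.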

A recently popular approach to implementing just-in-time compilers for dynamic languages is that of a tracing JIT. A tracing JIT works by observing the running program and recording its hot spots into linear execution traces. Those traces are optimized and turned into machine code.
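The recording step can be illustrated with a toy interpreter. The sketch below is an assumption-laden simplification and not how PyPy's JIT is built: the instruction names (`add`, `lt`, `jump_if_true`), the threshold `HOT_THRESHOLD` and the function `run` are all hypothetical. It counts backward jumps, starts recording once a loop is hot, and replaces the conditional jump with a guard so that the resulting trace is linear.

```python
HOT_THRESHOLD = 3  # assumed: number of backward jumps after which a loop is "hot"


def run(program, env):
    """Interpret `program` on `env`; return the first recorded loop trace."""
    pc = 0
    counts = {}        # loop start pc -> number of backward jumps seen
    recording = None   # operations recorded so far, or None when not tracing
    loop_start = None
    trace = None
    while pc < len(program):
        op = program[pc]
        kind = op[0]
        if kind == "jump_if_true":
            _, cond, target = op
            taken = bool(env[cond])
            if recording is not None:
                # Control flow is not recorded; a guard pins the observed path.
                recording.append(("guard_true" if taken else "guard_false", cond))
            if taken and target <= pc:  # backward jump: one loop iteration ends
                if recording is not None and target == loop_start:
                    trace, recording = recording, None  # the trace is complete
                elif trace is None and recording is None:
                    counts[target] = counts.get(target, 0) + 1
                    if counts[target] >= HOT_THRESHOLD:
                        recording, loop_start = [], target
            pc = target if taken else pc + 1
            continue
        _, dst, a, b = op
        if kind == "add":
            env[dst] = env[a] + env[b]
        elif kind == "lt":
            env[dst] = env[a] < env[b]
        else:
            raise ValueError(f"unknown operation: {kind}")
        if recording is not None:
            recording.append(op)
        pc += 1
    return trace


program = [
    ("add", "total", "total", "i"),
    ("add", "i", "i", "one"),
    ("lt", "cond", "i", "n"),
    ("jump_if_true", "cond", 0),
]
env = {"total": 0, "i": 0, "one": 1, "n": 10}
trace = run(program, env)
print(trace)
```

The recorded trace contains the three operations of the loop body followed by a single `guard_true`. A real tracing JIT would then optimize this trace and compile it to machine code; the sketch stops at recording.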

One reason for the popularity of tracing JITs is their relative simplicity. They can often be added to an existing interpreter, reusing a lot of the interpreter’s infrastructure. They give some important optimizations like inlining and constant-folding for free. A tracing JIT always produces linear pieces of code, which simplifies many of the hard algorithms in a compiler, such as register allocation.

The use of a tracing JIT can remove the overhead of bytecode dispatch and that of the interpreter data structures. In this paper we present a new optimization that can be added to a tracing JIT and that further removes some of the overhead more closely associated with dynamic languages, such as boxing overhead and type dispatching. Our experimental platform is the PyPy project, which is an environment for implementing dynamic programming languages. PyPy and tracing JITs are described in more detail in Section 2. Section 3 analyzes the problem to be solved more closely.

The core of our trace optimization technique can be viewed as partial evaluation: the partial evaluation performs a form of escape analysis [4] on the traces and makes some objects that are allocated in the trace static, which means that they do not occur any more in the optimized trace. This technique is informally described in Section 4; a more formal description is given in Section 5. The introduced techniques are evaluated in Section 6 using PyPy’s Python interpreter.
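The idea of making allocated objects static can be sketched on a toy trace format. The code below is a simplified illustration written for this text, not the algorithm as specified in Sections 4 and 5 and not PyPy's optimizer. The operation names (`new`, `set`, `get`, `guard_class`, `int_add`, `jump`), the tuple layout and the function `remove_allocations` are assumptions. Objects created by `new` are tracked symbolically; reads, writes and class guards on them are resolved during optimization, and the allocation is only emitted if the object escapes into an operation the optimizer does not understand.

```python
def remove_allocations(trace):
    """Toy allocation removal over a linear trace of tuples.

    Assumed layouts: ("new", res, cls), ("set", obj, field, val),
    ("get", res, obj, field), ("guard_class", obj, cls), and
    (kind, res, *args) for every other operation. The trace is assumed
    to be well formed (fields are written before they are read).
    """
    virtuals = {}  # variable -> (cls, {field: value}) for not-yet-allocated objects
    alias = {}     # variable -> variable or constant that replaces it
    out = []

    def resolve(v):
        return alias.get(v, v)

    def force(v):
        # The object escapes: emit its delayed allocation and field writes.
        if v not in virtuals:
            return
        cls, fields = virtuals.pop(v)
        out.append(("new", v, cls))
        for name, val in fields.items():
            force(val)
            out.append(("set", v, name, val))

    for op in trace:
        kind = op[0]
        if kind == "new":
            virtuals[op[1]] = (op[2], {})        # remember it, emit nothing
        elif kind == "set":
            obj, field, val = resolve(op[1]), op[2], resolve(op[3])
            if obj in virtuals:
                virtuals[obj][1][field] = val    # write is tracked symbolically
            else:
                force(val)                       # value escapes into the heap
                out.append(("set", obj, field, val))
        elif kind == "get":
            res, obj, field = op[1], resolve(op[2]), op[3]
            if obj in virtuals:
                alias[res] = virtuals[obj][1][field]  # read is resolved statically
            else:
                out.append(("get", res, obj, field))
        elif kind == "guard_class":
            obj = resolve(op[1])
            if obj not in virtuals:              # class of a virtual is already known
                out.append(("guard_class", obj, op[2]))
        else:
            args = [resolve(a) for a in op[2:]]
            for a in args:
                force(a)
            out.append((kind, op[1], *args))
    return out


trace = [
    ("guard_class", "p0", "BoxedInt"),
    ("get", "i0", "p0", "value"),
    ("int_add", "i1", "i0", 1),
    ("new", "p1", "BoxedInt"),
    ("set", "p1", "value", "i1"),
    ("guard_class", "p1", "BoxedInt"),
    ("get", "i2", "p1", "value"),
    ("int_add", "i3", "i2", 2),
    ("new", "p2", "BoxedInt"),
    ("set", "p2", "value", "i3"),
    ("jump", None, "p2"),
]
for op in remove_allocations(trace):
    print(op)
```

In this example the intermediate box `p1` never escapes, so its allocation, its field write, its class guard and the read from it all disappear. The box `p2` is passed to `jump`, which the sketch treats as an escape, so its allocation is emitted just before the jump.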

The contributions made by this paper are:

Footnote on modern dynamic languages: For languages in the LISP family, basic arithmetic operations are typically not overloaded; even in Smalltalk, type dispatching is much simpler than in Python or JavaScript.

2010/10/22
