Vulcan
Binary transformation in a distributed environment
Amitabh Srivastava
Microsoft Research
One Microsoft Way
Redmond, WA
amitabhs@microsoft.com
Andrew Edwards
Microsoft Research
One Microsoft Way
Redmond, WA
andred@microsoft.com
Hoi Vo
Microsoft Research
One Microsoft Way
Redmond, WA
hoiv@microsoft.com
ABSTRACT
Distributed computing on the Internet presents new
challenges and opportunities for tools that inspect and
modify program binaries. The dynamic and
heterogeneous nature of the Internet environment extends
the traditional product development process by requiring
program development tools like these, which were once
used only internally, to work in live environments too.
The concept of the compilation process must be expanded
along with the capabilities of the binary tools. This paper
presents Vulcan, a second-generation technology that
addresses many of these challenges. Vulcan provides
both static and dynamic code modification and provides a
framework for cross-component analysis and
optimization. It provides system-level analysis for
heterogeneous binaries across instruction sets. Vulcan
works in the Win32 environment and can process x86,
IA64, and MSIL binaries. Vulcan scales to large
commercial applications and has been used to improve
performance and reliability of Microsoft products in a
production environment.
1. INTRODUCTION
In recent years, binary instrumentation and optimization
tools (hereafter called “binary tools”) have been
effectively used to understand and improve the
performance of significant programs[23][24]. Because
they are new, binary tools are typically not well integrated
with the existing compiler framework. (They often rely
on slightly modified executable formats, so that relocation
information is retained, and code and data can be easily
distinguished in the executable.)
At the same time, the dynamic and heterogeneous nature
of Internet computing has challenged the traditional
compilation model, presenting great opportunities to
expand the role of binary tools in improving performance
and reliability of software. Vulcan is a second-generation
technology that addresses many of the challenges of this
new generation of computing. To understand the role of
Vulcan, we first describe how the traditional compilation
framework has changed.
In the past, the compilation process has focused simply on
turning source code into executables, balancing
compilation speed against code optimization. The static
compiler turns source files into object files, and the linker
then combines the object files to produce the final
executable. Very little program information is preserved
after the link stage, and what remains is kept mostly for
debugging and support.
The binary modification stage that follows was not designed
as part of the original compilation process. Binary tools
“hacked” their way into the compilation process by
intercepting and transforming executables that had
already been compiled and linked. The binary
modification stage thereby provided language
independence and a natural environment for whole-program
analysis and architecture-specific
transformations without requiring recompilation.
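The pipeline described above can be illustrated with a toy model. This is a minimal sketch, not Vulcan's API: the names `compile_unit`, `link`, `instrument`, and `run`, the pseudo-instructions (`call`, `count`, `ret`, `nop`), and the dictionary-based "executable" are all assumptions made for illustration. It shows a post-link stage that takes an already linked image and adds call counters to every function, without access to the source and regardless of which compiler produced each object file.

```python
# Toy model of: compile -> link -> post-link binary modification.
# Every name and pseudo-instruction here is hypothetical (not Vulcan's API).

def compile_unit(name, functions):
    """'Compile' one source file into an object file:
    a mapping from function name to a list of pseudo-instructions."""
    return {"name": name, "functions": dict(functions)}

def link(objects):
    """Combine object files into one 'executable' (a flat function table)."""
    executable = {}
    for obj in objects:
        for fn, body in obj["functions"].items():
            if fn in executable:
                raise ValueError(f"duplicate symbol: {fn}")
            executable[fn] = list(body)
    return executable

def instrument(executable):
    """Post-link stage: return a new image in which every function
    starts with a 'count <fn>' pseudo-instruction. The input image
    is left unmodified."""
    return {fn: [f"count {fn}"] + body for fn, body in executable.items()}

def run(executable, entry, counts):
    """Interpret the toy image, recording 'count' events in `counts`."""
    for ins in executable[entry]:
        op, _, arg = ins.partition(" ")
        if op == "count":
            counts[arg] = counts.get(arg, 0) + 1
        elif op == "call":
            run(executable, arg, counts)
        elif op == "ret":
            break
        # any other op (e.g. 'nop') does nothing in this model
    return counts

objects = [
    compile_unit("main.c", {"main": ["call helper", "call helper", "ret"]}),
    compile_unit("util.c", {"helper": ["nop", "ret"]}),
]
exe = link(objects)
profile = run(instrument(exe), "main", {})
print(profile)
```

Because `instrument` operates on the linked function table rather than on any one object file, it sees the whole program at once, which is the property the text attributes to the binary modification stage.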
The new distributed computing model of the Internet
presents new challenges for software development tools:
its heterogeneous nature forces applications to be built
with components in multiple instruction sets, and its
dynamic nature extends the traditional product
development process to live environments.
The heterogeneous nature of the Internet requires certain
compilation phases like code generation and optimization
to be delayed until run time. Programs can be compiled
to architecture-independent languages like MSIL¹
(Microsoft Intermediate Language[19]), with final
optimization and code generation performed at run time.
As shown in Figure 1, parts of a heterogeneous
application may still exist in MSIL while other parts may
¹ MSIL is Microsoft’s intermediate language for the managed
environment. A number of languages, such as C#, VB, and Cobol,
can be compiled to MSIL. MSIL is converted to native
code by JIT compilers.