A Proposal for Hardware-Assisted Arithmetic Overflow
Detection for Array and Bitfield Operations
Darek Mihocka
Intel Corp.
Darek.Mihocka@Intel.com
Jens Troeger
Intel Corp.
Jens.Troeger@Intel.com
Abstract
Detecting arithmetic overflow during summation operations
is vital to ensuring correct and secure behavior of many
types of code. For example, applying transformations to
signed integer pixel co-ordinates without any overflow
detection may result in pixels rendering at unexpected
negative co-ordinates, summing a large array of signed or
unsigned integers without overflow detection can result in
bogus totals, or performing arithmetic operations on packed
bitfields without overflow detection could result in
corruption of data in adjacent bitfields.
A traditional way to detect arithmetic overflow is to insert
specific checks of the host processor’s Overflow arithmetic
condition flag after each arithmetic operation to detect
signed integer overflow, or a check of the host processor’s
Carry arithmetic flag to detect unsigned integer overflow.
The C# language for example includes the keyword
“checked”
1
which directs the code generator to inject such
checks. Similarly, some versions of the gcc native compiler
support a switch to generate such checks.
2
A drawback of this approach is that since it relies on the
host arithmetic flags to detect arithmetic overflow, the
addition or subtraction operation must immediately be
followed by the check as well as by some conditional
branch or trap for when the overflow is detected. This
severely serializes the ability to parallelize array summation
or packed bitfield arithmetic, as it requires a separate test
and branch for each arithmetic operation.
Modern processors such as the Intel Core and Intel Atom
series offset SIMD extensions which allow for packed data
operations on signed and unsigned integers.
3
This permits
up to 8 additions or subtractions to be performed in one
operation, with overflow detection (in the form of a
Saturate status bit). However, even the SIMD approach is
limited to the data sizes and operations supported by the
particular SIMD instruction set.
An additional problem for either the traditional arithmetic
flags or SIMD based overflow detection mechanisms is that
they cannot operate on small or unconventional data sizes.
The Intel Larrabee
4
SIMD instruction set for example, only
supports packed 32-bit and 64-bit integer types, and
therefore cannot operate on 8-bit or 16-bit integers directly
without additional data conversion operations.
Neither approach can operate directly on bitfield data types
such as the common 5-6-5 packing of RGB pixel values, or
something even more trivial such as incrementing a 4-bit
bitfield, since the smallest arithmetic flags generating
operations on most processors require data elements at least
8 bits wide.
This paper examines an alternative and purely integer-based
“lazy flags”
5
method of detecting signed and unsigned
arithmetic overflow conditions which decouples the
generation of the sums (or differences) from the detection
of the arithmetic overflow in a manner that is not dependent
on the specific integer data size capabilities of the host
processor. Such a decoupling allows for the vectorization
of both the arithmetic operation as well as the overflow
detection without the need for specific SIMD extensions.
This has practical uses for runtime written in high-level
languages such as bytecode interpreters and simulators
where direct access to hardware arithmetic flags or SIMD
extensions is difficult.
This paper will then propose simple instruction set
extensions for implementing the lazy flags approach to
allow for hardware-assisted acceleration of overflow
detection across bitfields and arrays without the data size
restrictions of existing integer or SIMD implementations.
Keywords
Lazy Flags, Arithmetic Overflow, Hardware Acceleration