A 256 Kbits L-TAGE branch predictor
Andr´e Seznec
IRISA/INRIA/HIPEAC
Abstract
The TAGE predictor, TAgged GEometric length predictor,
was introduced in [10].
TAGE relies on several predictor tables indexed through
independent functions of the global branch/path history and
the branch address. The TAGE predictor uses (partially)
tagged components as the PPM-like predictor [5]. It relies
on (partial) match as the prediction computation function.
TAGE also uses GEometric history length as the O-GEHL
predictor [6], i.e. , the set of used global history lengths
forms a geometric series, i.e.,
. This al-
lows to efficiently capture correlation on recent branch out-
comes as well as on very old branches.
For the realistic track of CBP-2, we present a L-TAGE
predictor consisting of a 13-component TAGE predictor
combined with a 256-entry loop predictor. This predictor
achieves 3.314 misp/KI on the set of distributed traces.
Presentation outline
We first recall the TAGE predictor principles [10] and its
main characteristics. Then, we describe the L-TAGE con-
figuration submitted to CBP-2 combining a loop predictor
and a TAGE predictor. Section 3 discusses implementation
issues on the L-TAGE predictor. Section 4 presents simula-
tion results for the submitted L-TAGE predictor and a few
other TAGE predictor configurations. Section 5 briefly re-
views the related works that had major influences in the L-
TAGE predictor proposition and discusses a few tradeoffs
that might influence the choice of a TAGE configuration for
an effective implementation.
1. The TAGE conditional branch predictor
The TAGE predictor is derived from Michaud’s PPM-
like tag-based branch predictor [5] and uses geometric his-
tory lengths [6]. Figure 1 illustrates a TAGE predictor. The
TAGE predictor features a base predictor T0 in charge of
providing a basic prediction and a set of (partially) tagged
This work was partially supported by an Intel research grant, an Intel
research equipment donation and by the European Commission in the
context of the SARC integrated project #27648 (FP6).
predictor components Ti. These tagged predictor compo-
nents Ti,
are indexed using different his-
tory lengths that form a geometric series, i.e,
"!#$&%'
)(+*-,/.0
.
Throughout this paper, the base predictor will be a sim-
ple PC-indexed 2-bit counter bimodal table; in order to save
storage space, the hysteresis bit is shared among several
counters as in [7].
An entry in a tagged component consists in a signed
counter ctr which sign provides the prediction, a (partial)
tag and an unsigned useful counter u. Throughout this pa-
per, u is a 2-bit counter and ctr is a 3-bit counter.
A few definitions and notations The provider component is
the matching component with the longest history. The al-
ternate prediction altpred is the prediction that would have
occurred if there had been a miss on the provider compo-
nent.
If there is no hitting component then altpred is the de-
fault prediction.
1.1. Prediction computation
At prediction time, the base predictor and the tagged
components are accessed simultaneously. The base predic-
tor provides a default prediction. The tagged components
provide a prediction only on a tag match.
In the general case, the overall prediction is provided by
the hitting tagged predictor component that uses the longest
history, or in case of no matching tagged predictor compo-
nent, the default prediction is used.
However, we found that, on several applications, using
the alternate prediction for newly allocated entries is more
efficient. Our experiments showed this property is essen-
tially global to the application and can be dynamically mon-
itored through a single 4-bit counter (USE ALT ON NA in
the simulator). On the predictor an entry is classified as
“newly allocated” if its prediction counter is weak.
Therefore the prediction computation algorithm is as fol-
lows:
1. Find the matching component with the longest history
2. if (the prediction counter is not weak or
USE ALT ON NA is negative) then the predic-