Fig. 2. Illustration of the binarization for mb_type (left) and sub_mb_type (right), both for P/SP slices.
Although at this stage nothing seems to be gained, there
is already the advantage of using a binary arithmetic coding
engine on the bin string instead of an m-ary arithmetic coder
operating on the original m-ary source alphabet. Adaptive
m-ary arithmetic coding (for m > 2) is in general a computa-
tionally complex operation requiring at least two multiplica-
tions for each symbol to encode as well as a number of
fairly complex operations to perform the update of the
probability estimation [36]. In contrast, there are fast,
multiplication-free variants of binary arithmetic coding, one
of which was specifically developed for the CABAC
framework, as further described below. Since the probabil-
ity of symbols with longer bin strings is typically very low,
the computational overhead of coding all bins of that bin
string instead of using only one pass in an m-ary arithmetic
coder is fairly small and can be easily compensated by using
a fast binary coding engine.
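To make the per-bin cost concrete, the following is a minimal sketch of a single interval-subdivision step of a conventional binary arithmetic encoder. It is not the actual H.264/AVC M coder, which additionally replaces even the remaining multiplication by a small table lookup; the Q16 fixed-point probability format, the struct, and the function names are illustrative assumptions.

#include <stdint.h>

typedef struct {
    uint32_t low;    /* lower bound of the current coding interval */
    uint32_t range;  /* width of the current coding interval */
} BinEncoder;

/* Encode one bin, given the probability of the less probable symbol
   (LPS) in Q16 fixed point; here bin == 1 is taken to be the LPS. */
static void encode_bin(BinEncoder *e, int bin, uint32_t p_lps_q16)
{
    uint32_t r_lps = (uint32_t)(((uint64_t)e->range * p_lps_q16) >> 16);

    if (bin == 0) {                 /* MPS: keep the lower subinterval */
        e->range -= r_lps;
    } else {                        /* LPS: take the upper subinterval */
        e->low  += e->range - r_lps;
        e->range = r_lps;
    }
    /* Renormalize so that range keeps enough precision; carry
       propagation and the actual bit output are omitted here. */
    while (e->range < (1u << 15)) {
        e->range <<= 1;
        e->low   <<= 1;
    }
}

A comparable step of an adaptive m-ary coder would, in addition, have to maintain and search a cumulative frequency table for all m symbols, which is where the extra multiplications and update costs arise.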
Finally, as the most important advantage, binarization en-
ables context modeling on a sub-symbol level. For specific bins, in general the most frequently observed ones, conditional probability models can be used, whereas the remaining, usually less frequently observed bins can be treated with a joint, typically zero-order probability model. Compared to the conventional approach of using
context models in the original domain of the source with
typically large alphabet size (e.g., components of motion vector differences or transform coefficient levels), this addi-
tional freedom in the design offers a flexible instrument for
using higher-order conditional probabilities without suffer-
ing from context “dilution” effects. These effects are often
observed in cases where a large number of conditional probabilities have to be adaptively estimated over a relatively short (coding) time interval, such that there are not enough samples to reach a reliable estimate for each model (a more rigorous treatment of this problem can be found in [23], [24]).
For instance, when operating in the original alphabet domain, a quite moderately chosen 2nd-order model for a given syntax element alphabet of size m = 256 will result in the intractably large number of 256² ⋅ (256 − 1) ≈ 2²⁴ symbol probabilities to be estimated for that particular syntax element only. Even for a zero-order model, the task of tracking 255 individual probability estimates according to the previous example is quite demanding. However, typically meas-
ured probability density functions (pdf) of prediction re-
siduals or transformed prediction errors can be modeled by
highly peaked Laplacian, generalized Gaussian or geometric
distributions [28], where it is reasonable to restrict the esti-
mation of individual symbol statistics to the area of the
largest statistical variations at the peak of the pdf. Thus, if,
for instance, a binary tree resulting from a Huffman code
design were chosen as the binarization for such a source
and its related pdf, only the nodes located in the vicinity of
the root node would be natural candidates for being mod-
eled individually, whereas a joint model would be assigned
to all nodes on deeper tree levels corresponding to the “tail”
of the pdf. Note that this design is different from the exam-
ple given in Fig. 2, where each (internal) node has its own
model.
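A minimal sketch of such a node-dependent model assignment could look as follows; the threshold of four root-level bins and the names are illustrative assumptions, not values taken from the standard.

/* Return the probability-model index for the bin at depth bin_idx of
   the binarization tree: bins close to the root receive individual
   adaptive models, while all deeper "tail" bins share one joint model. */
#define NUM_ROOT_MODELS 4

static int model_for_bin(int bin_idx)
{
    return (bin_idx < NUM_ROOT_MODELS) ? bin_idx : NUM_ROOT_MODELS;
}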
In the CABAC framework, typically, only the root node
would be modeled using higher-order conditional probabili-
ties. In the above example, this would result, for a 2nd-order model, in only 4 different binary probability models instead of m² different m-ary probability models with m = 256.
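Expressed in code, such a conditioning could look as follows, where b1 and b2 denote the values of the root bins of two previously coded symbols, e.g. of neighboring blocks (a hypothetical sketch; the actual context templates of CABAC are specified per syntax element).

/* Select one of 2 * 2 = 4 binary probability models for the root bin,
   conditioned on the root-bin values b1, b2 of two prior symbols. */
static int root_bin_context(int b1, int b2)
{
    return 2 * b1 + b2;   /* context index in {0, 1, 2, 3} */
}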
2) Design of CABAC Binarization Schemes
As already indicated above, a binary representation for a
given non-binary valued syntax element provided by the bi-
narization process should be close to a minimum-
redundancy code. On the one hand, this gives the subsequent modeling stage easy access to the most probable symbols by means of the binary decisions located at or close to the root node. On the other hand, such a code tree
minimizes the number of binary symbols to encode on the
average, hence minimizing the computational workload in-
duced by the binary arithmetic coding stage.
However, instead of choosing a Huffman tree for a given
training sequence, the design of binarization schemes in
CABAC (mostly) relies on a few basic code trees, whose
structure enables a simple on-line computation of all code
words without the need for storing any tables. There are
four such basic types: the unary code, the truncated unary
code, the kth-order Exp-Golomb code, and the fixed-length
code. In addition, there are binarization schemes based on a
concatenation of these elementary types. As an exception to these structured types, there are five specific, mostly un-
structured binary trees that have been manually chosen for
the coding of macroblock types and sub-macroblock types.
Two examples of such trees are shown in Fig. 2.
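As an illustration of how such structured codes allow codewords to be computed on the fly, the following sketch generates the kth-order Exp-Golomb bin string of an unsigned value; put_bin() is an assumed hook that hands a single bin to the subsequent coding stage.

void put_bin(int b);   /* assumed interface to the coding engine */

/* kth-order Exp-Golomb binarization: a prefix of "1" bits selects a
   bucket of 2^k, then 2^(k+1), ... values and is terminated by a "0";
   a binary suffix then locates the value within its bucket. */
static void binarize_eg(unsigned value, unsigned k)
{
    while (value >= (1u << k)) {      /* prefix part */
        put_bin(1);
        value -= 1u << k;
        k++;
    }
    put_bin(0);                       /* prefix terminator */
    while (k--)                       /* fixed-length suffix */
        put_bin((value >> k) & 1u);
}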
In the remaining part of this section, we explain in more
detail the construction of the four basic types of binarization
and their derivatives.
Unary and Truncated Unary Binarization Scheme: For
each unsigned integer valued symbol x ≥ 0 the unary code
word in CABAC consists of x “1” bits plus a terminating
“0” bit. The truncated unary (TU) code is only defined for x
with 0 ≤ x ≤ S, where for x < S the code is given by the
unary code, whereas for x = S the terminating “0” bit is ne-
glected, such that the TU code of x = S is given by a codeword consisting of S “1” bits only.
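In code, both schemes reduce to simple loops, as in the following sketch; put_bin() again denotes an assumed hand-off of one bin to the coding engine.

void put_bin(int b);   /* assumed interface to the coding engine */

/* Unary binarization: x "1" bits followed by a terminating "0". */
static void binarize_unary(unsigned x)
{
    for (unsigned i = 0; i < x; i++)
        put_bin(1);
    put_bin(0);
}

/* Truncated unary (TU) binarization with cut-off S (0 <= x <= S):
   identical to the unary code for x < S; for x = S the terminating
   "0" is omitted, since the decoder can infer it from the cut-off. */
static void binarize_tu(unsigned x, unsigned S)
{
    for (unsigned i = 0; i < x; i++)
        put_bin(1);
    if (x < S)
        put_bin(0);
}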