Fig. 2. Illustration of the binarization for mb_type (left) and sub_mb_type (right), both for P/SP slices.
Although at this stage nothing seems to be gained, there
is already the advantage of using a binary arithmetic coding
engine on the bin string instead of an m-ary arithmetic coder
operating on the original m-ary source alphabet. Adaptive
m-ary arithmetic coding (for m > 2) is in general a computa-
tionally complex operation requiring at least two multiplica-
tions for each symbol to encode as well as a number of
fairly complex operations to perform the update of the
probability estimation [36]. In contrast, there are fast,
multiplication-free variants of binary arithmetic coding, one
of which was specifically developed for the CABAC
framework, as further described below. Since the probabil-
ity of symbols with longer bin strings is typically very low,
the computational overhead of coding all bins of that bin
string instead of using only one pass in an m-ary arithmetic
coder is fairly small and can be easily compensated by using
a fast binary coding engine.
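To make the per-bin cost concrete, the following is a minimal sketch of a single interval-subdivision step of a conventional binary arithmetic encoder. It is not the actual H.264/AVC M coder, which additionally replaces even the remaining multiplication by a small table lookup; the Q16 fixed-point probability format, the struct, and the function names are illustrative assumptions.

#include <stdint.h>

typedef struct {
    uint32_t low;    /* lower bound of the current coding interval */
    uint32_t range;  /* width of the current coding interval */
} BinEncoder;

/* Encode one bin, given the probability of the less probable symbol
   (LPS) in Q16 fixed point; here bin == 1 is taken to be the LPS. */
static void encode_bin(BinEncoder *e, int bin, uint32_t p_lps_q16)
{
    uint32_t r_lps = (uint32_t)(((uint64_t)e->range * p_lps_q16) >> 16);

    if (bin == 0) {                 /* MPS: keep the lower subinterval */
        e->range -= r_lps;
    } else {                        /* LPS: take the upper subinterval */
        e->low  += e->range - r_lps;
        e->range = r_lps;
    }
    /* Renormalize so that range keeps enough precision; carry
       propagation and the actual bit output are omitted here. */
    while (e->range < (1u << 15)) {
        e->range <<= 1;
        e->low   <<= 1;
    }
}

A comparable step of an adaptive m-ary coder would, in addition, have to maintain and search a cumulative frequency table for all m symbols, which is where the extra multiplications and update costs arise.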
Finally, as the most important advantage, binarization en-
ables context modeling on a sub-symbol level. For specific bins, in general the most frequently observed ones, conditional probability models can be used, whereas the remaining, usually less frequently observed bins can be treated with a joint, typically zero-order probability model. Compared to the conventional approach of using
context models in the original domain of the source with
typically large alphabet size (e.g., components of motion vector differences or transform coefficient levels), this addi-
tional freedom in the design offers a flexible instrument for
using higher-order conditional probabilities without suffer-
ing from context “dilution” effects. These effects are often
observed in cases where a large number of conditional probabilities have to be adaptively estimated over a relatively short (coding) time interval, such that there are not enough samples to reach a reliable estimate for each model (a more rigorous treatment of this problem can be found in [23], [24]).
For instance, when operating in the original alphabet domain, a quite moderately chosen 2nd-order model for a given syntax element alphabet of size m = 256 will result in the intractably large number of 256² ⋅ (256 − 1) ≈ 2²⁴ symbol probabilities to be estimated for that particular syntax element only. Even for a zero-order model, the task of tracking 255 individual probability estimates according to the previous example is quite demanding. However, typically meas-
ured probability density functions (pdf) of prediction re-
siduals or transformed prediction errors can be modeled by
highly peaked Laplacian, generalized Gaussian or geometric
distributions [28], where it is reasonable to restrict the esti-
mation of individual symbol statistics to the area of the
largest statistical variations at the peak of the pdf. Thus, if,
for instance, a binary tree resulting from a Huffman code
design were chosen as the binarization for such a source
and its related pdf, only the nodes located in the vicinity of
the root node would be natural candidates for being mod-
eled individually, whereas a joint model would be assigned
to all nodes on deeper tree levels corresponding to the “tail”
of the pdf. Note that this design is different from the exam-
ple given in Fig. 2, where each (internal) node has its own
model.
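A minimal sketch of such a node-dependent model assignment could look as follows; the threshold of four root-level bins and the names are illustrative assumptions, not values taken from the standard.

/* Return the probability-model index for the bin at depth bin_idx of
   the binarization tree: bins close to the root receive individual
   adaptive models, while all deeper "tail" bins share one joint model. */
#define NUM_ROOT_MODELS 4

static int model_for_bin(int bin_idx)
{
    return (bin_idx < NUM_ROOT_MODELS) ? bin_idx : NUM_ROOT_MODELS;
}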
In the CABAC framework, typically, only the root node
would be modeled using higher-order conditional probabili-
ties. In the above example, this would result, for a 2nd-order model, in only 4 different binary probability models instead of m² different m-ary probability models with m = 256.
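Expressed in code, such a conditioning could look as follows, where b1 and b2 denote the values of the root bins of two previously coded symbols, e.g. of neighboring blocks (a hypothetical sketch; the actual context templates of CABAC are specified per syntax element).

/* Select one of 2 * 2 = 4 binary probability models for the root bin,
   conditioned on the root-bin values b1, b2 of two prior symbols. */
static int root_bin_context(int b1, int b2)
{
    return 2 * b1 + b2;   /* context index in {0, 1, 2, 3} */
}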
2) Design of CABAC Binarization Schemes
As already indicated above, a binary representation for a
given non-binary valued syntax element provided by the bi-
narization process should be close to a minimum-
redundancy code. On the one hand, this gives the subsequent modeling stage easy access to the most probable symbols by means of the binary decisions located at or close to the root node. On the other hand, such a code tree
minimizes the number of binary symbols to encode on the
average, hence minimizing the computational workload in-
duced by the binary arithmetic coding stage.
However, instead of choosing a Huffman tree for a given
training sequence, the design of binarization schemes in
CABAC (mostly) relies on a few basic code trees, whose
structure enables a simple on-line computation of all code
words without the need for storing any tables. There are
four such basic types: the unary code, the truncated unary
code, the kth-order Exp-Golomb code, and the fixed-length
code. In addition, there are binarization schemes based on a
concatenation of these elementary types. As an exception to these structured types, there are five specific, mostly un-
structured binary trees that have been manually chosen for
the coding of macroblock types and sub-macroblock types.
Two examples of such trees are shown in Fig. 2.
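As an illustration of how such structured codes allow codewords to be computed on the fly, the following sketch generates the kth-order Exp-Golomb bin string of an unsigned value; put_bin() is an assumed hook that hands a single bin to the subsequent coding stage.

void put_bin(int b);   /* assumed interface to the coding engine */

/* kth-order Exp-Golomb binarization: a prefix of "1" bits selects a
   bucket of 2^k, then 2^(k+1), ... values and is terminated by a "0";
   a binary suffix then locates the value within its bucket. */
static void binarize_eg(unsigned value, unsigned k)
{
    while (value >= (1u << k)) {      /* prefix part */
        put_bin(1);
        value -= 1u << k;
        k++;
    }
    put_bin(0);                       /* prefix terminator */
    while (k--)                       /* fixed-length suffix */
        put_bin((value >> k) & 1u);
}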
In the remaining part of this section, we explain in more
detail the construction of the four basic types of binarization
and their derivatives.
Unary and Truncated Unary Binarization Scheme: For
each unsigned integer valued symbol x ≥ 0 the unary code
word in CABAC consists of x “1” bits plus a terminating
“0” bit. The truncated unary (TU) code is only defined for x
with 0 ≤ x ≤ S, where for x < S the code is given by the
unary code, whereas for x = S the terminating “0” bit is ne-
glected, such that the TU code of x = S is given by a codeword consisting of S “1” bits only.
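In code, both schemes reduce to simple loops, as in the following sketch; put_bin() again denotes an assumed hand-off of one bin to the coding engine.

void put_bin(int b);   /* assumed interface to the coding engine */

/* Unary binarization: x "1" bits followed by a terminating "0". */
static void binarize_unary(unsigned x)
{
    for (unsigned i = 0; i < x; i++)
        put_bin(1);
    put_bin(0);
}

/* Truncated unary (TU) binarization with cut-off S (0 <= x <= S):
   identical to the unary code for x < S; for x = S the terminating
   "0" is omitted, since the decoder can infer it from the cut-off. */
static void binarize_tu(unsigned x, unsigned S)
{
    for (unsigned i = 0; i < x; i++)
        put_bin(1);
    if (x < S)
        put_bin(0);
}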