www.vcodex.com H.264 / MPEG-4 Part 10 : Inter Prediction
© Iain E G Richardson 30/04/03 Page 1 of 3
H.264 / MPEG-4 Part 10 White Paper
Prediction of Inter Macroblocks in P-slices
1. Introduction
The Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG are finalising a new standard for
the coding (compression) of natural video images. The new standard [1] will be known as H.264 and
also MPEG-4 Part 10, “Advanced Video Coding”. This document describes the methods of predicting
inter-coded macroblocks in P-slices in an H.264 CODEC.
Inter prediction creates a prediction model from one or more previously encoded video frames. The
model is formed by shifting samples in the reference frame(s) (motion compensated prediction). The
AVC CODEC uses block-based motion compensation, the same principle adopted by every major
coding standard since H.261. Important differences from earlier standards include the support for a
range of block sizes (down to 4x4) and fine sub-pixel motion vectors (1/4 pixel in the luma
component).
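As a minimal sketch of the idea above, the following Python fragment forms a prediction by copying a displaced block from a reference frame. The function names, the list-of-rows frame layout and the storage of vectors in quarter-sample units are assumptions made for illustration and are not taken from the standard text. Only whole-sample displacements are handled; the interpolation needed for fractional positions is not shown.

```python
def split_quarter_pel(mv):
    """Split one motion vector component, assumed to be stored in
    quarter-sample units, into (whole-sample offset, quarter-sample fraction)."""
    return mv >> 2, mv & 3


def predict_block(ref, x, y, w, h, mv_x, mv_y):
    """Return the w x h prediction for the block at (x, y), copied from
    the reference frame `ref` displaced by a whole-sample vector (mv_x, mv_y).
    No bounds checking: the displaced block must lie inside `ref`."""
    return [[ref[y + mv_y + j][x + mv_x + i] for i in range(w)]
            for j in range(h)]


# Hypothetical 8x8 reference frame whose sample value encodes its position.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]

# Predict the 4x4 block at (2, 2) from one sample right and one sample up.
pred = predict_block(ref, 2, 2, 4, 4, 1, -1)
```

Here `split_quarter_pel(9)` gives a whole-sample offset of 2 and a fraction of 1 (one quarter sample), so a full implementation would interpolate between reference samples rather than copy them directly.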
2. Tree structured motion compensation
AVC supports motion compensation block sizes ranging from 16x16 to 4x4 luminance samples with
many options between the two. The luminance component of each macroblock (16x16 samples) may
be split up in 4 ways as shown in Figure 2-1: 16x16, 16x8, 8x16 or 8x8. Each of the sub-divided
regions is a macroblock partition. If the 8x8 mode is chosen, each of the four 8x8 macroblock
partitions within the macroblock may be split in a further 4 ways as shown in Figure 2-2: 8x8, 8x4,
4x8 or 4x4 (known as macroblock sub-partitions). These partitions and sub-partitions give rise to a
large number of possible combinations within each macroblock. This method of partitioning
macroblocks into motion compensated sub-blocks of varying size is known as tree structured motion
compensation.
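To give a feel for the "large number of possible combinations", the partition choices described above can be enumerated with a short Python sketch. The names used here are hypothetical and the count covers only the block layout of the luminance component, not how the choice is coded in the bitstream.

```python
from itertools import product

# (width, height) in luma samples, as listed in the text above.
MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def blocks_in(region, block):
    """Number of `block`-sized pieces needed to tile a square region
    whose side is `region` samples."""
    w, h = block
    return (region // w) * (region // h)


def enumerate_layouts():
    """Yield (layout, block_count) for every way of partitioning one
    macroblock. Each block needs its own motion vector."""
    for part in MB_PARTITIONS:
        if part != (8, 8):
            yield (part,), blocks_in(16, part)
        else:
            # Each of the four 8x8 partitions picks its sub-partition mode
            # independently of the others.
            for subs in product(SUB_PARTITIONS, repeat=4):
                yield subs, sum(blocks_in(8, s) for s in subs)


layouts = list(enumerate_layouts())
mv_counts = [count for _, count in layouts]
print(len(layouts), min(mv_counts), max(mv_counts))  # prints "259 1 16"
```

Under these assumptions there are 3 layouts without sub-partitioning plus 4^4 = 256 layouts in 8x8 mode, and a macroblock carries between 1 and 16 motion-compensated blocks.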
Figure 2-1 Macroblock partitions: 16x16 (partition 0), 8x16 (0 and 1 side by side), 16x8 (0 above 1), 8x8 (0 to 3 in raster order)
Figure 2-2 Macroblock sub-partitions: 8x8 (sub-partition 0), 4x8 (0 and 1 side by side), 8x4 (0 above 1), 4x4 (0 to 3 in raster order)
A separate motion vector is required for each partition or sub-partition. Each motion vector must be
coded and transmitted; in addition, the choice of partition(s) must be encoded in the compressed
bitstream. Choosing a large partition size (e.g. 16x16, 16x8, 8x16) means that a small number of bits
are required to signal the choice of motion vector(s) and the type of partition; however, the motion
compensated residual may contain a significant amount of energy in frame areas with high detail.
Choosing a small partition size (e.g. 8x4, 4x4, etc.) may give a lower-energy residual after motion
compensation but requires a larger number of bits to signal the motion vectors and choice of
partition(s). The choice of partition size therefore has a significant impact on compression