ISO/IEC 14496-15:2013(E)
© ISO/IEC 2013 – All rights reserved
While the use of shadow sync is supported for backward compatibility reasons, this use is deprecated and use
of the mechanisms defined in 5.4.6 is recommended.
4.10 Sample groups on random access recovery points and random access points
The video coding system can include the concept of a ‘gradual decoding refresh’ or random access recovery
point. This may be signalled in the bit-stream using a mechanism such as the recovery point SEI message.
This message is found at the beginning of the random access, and indicates how much data must be decoded
subsequent to the access unit at the position of the SEI message before the recovery is complete.
When all access units in output order starting from the access unit at the position of the SEI message can be
successfully decoded after random access, i.e. when the recovery_frame_cnt syntax element of the recovery
point SEI message is 0, the Random Access Point (‘rap ’) sample grouping should be used.
This concept of gradual recovery is also supported in the file format by using RollRecoveryEntry Groups [4.5].
In order that the group membership marks the sample containing the SEI message, the ‘roll-distance’ is
constrained to being only positive (i.e. a post-roll). In other words, RollRecoveryEntry Groups can be used
when the value of the recovery_frame_cnt syntax element of the recovery point SEI message is greater than 0.
Note – The roll-group counts samples in the file format; this may not match the way that the distances
are represented in the SEI message.
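
The choice between the two sample groupings described above can be illustrated with a minimal sketch. The function name and parameters are hypothetical and not part of this specification; the grouping type ‘roll’ is the one the ISO base media file format uses for RollRecoveryEntry. Because the roll-group counts samples, the sketch assumes the caller supplies the roll distance in samples rather than deriving it from recovery_frame_cnt.

```python
from typing import Optional, Tuple


def choose_recovery_grouping(
    recovery_frame_cnt: int,
    roll_distance_in_samples: Optional[int] = None,
) -> Tuple[str, Optional[int]]:
    """Pick a sample grouping for a sample carrying a recovery point SEI.

    Hypothetical helper: returns (grouping_type, roll_distance).
    """
    if recovery_frame_cnt < 0:
        raise ValueError("recovery_frame_cnt must not be negative")

    if recovery_frame_cnt == 0:
        # Recovery is immediate: use the Random Access Point grouping.
        # Note the trailing space in the four-character code.
        return ("rap ", None)

    # Gradual recovery: use a roll group with a positive (post-roll) distance.
    # The distance is counted in file-format samples and is supplied by the
    # caller; it is not assumed to equal recovery_frame_cnt.
    if roll_distance_in_samples is None or roll_distance_in_samples <= 0:
        raise ValueError("a positive roll distance in samples is required")
    return ("roll", roll_distance_in_samples)
```

For example, `choose_recovery_grouping(0)` selects the ‘rap ’ grouping, while `choose_recovery_grouping(4, roll_distance_in_samples=3)` selects a roll group with a distance of 3 samples.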
Within a stream, it is necessary to mark the beginning of the pre-roll, so that a stream decoder may start
decoding there. However, in a file, when performing random access, a deterministic search is desired for the
closest preceding frame which can be decoded perfectly (either a sync sample, or the end of a pre-roll).
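
The deterministic search described above could be sketched as follows. This is one possible interpretation, not a normative procedure: the `SampleInfo` type and `find_decode_start` function are hypothetical, and the sketch assumes that a roll-group member at index i with positive roll distance d yields correct output from sample i + d onwards.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SampleInfo:
    """Hypothetical per-sample metadata gathered from the sample tables."""
    is_sync: bool = False
    roll_distance: Optional[int] = None  # positive post-roll, in samples


def find_decode_start(samples: Sequence[SampleInfo], target: int) -> Optional[int]:
    """Return the closest preceding sample index from which decoding can
    start so that the sample at `target` is decoded correctly, or None."""
    if not 0 <= target < len(samples):
        raise IndexError("target is outside the sample range")

    # Walk backwards so the first match is the closest starting point.
    for i in range(target, -1, -1):
        sample = samples[i]
        if sample.is_sync:
            return i
        distance = sample.roll_distance
        # A roll-group member is usable only if its recovery completes
        # at or before the target sample.
        if distance is not None and distance > 0 and i + distance <= target:
            return i
    return None
```

With a sync sample at index 0 and a roll-group member at index 2 with a distance of 2, a request for sample 5 would start decoding at index 2, whereas a request for sample 3 would fall back to the sync sample at index 0, because recovery from index 2 is not complete until sample 4.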
4.11 Hinting
Note that what the hint tracks call “B frames” are actually ‘disposable’ pictures or non-reference pictures, for
example as defined in ISO/IEC 14496-10.
Care should be taken when the structures in Annex A (aggregators or extractors) are in use and the track is
hinted. These structures are defined only for use in the file format and should not be transmitted. In particular,
a hint track that points at an extractor in a video track would cause the extractor itself to be transmitted (which
is probably both incorrect and not the desired behaviour), not the data the extractor references. Hint tracks
should normally directly reference NAL units specified in the applicable video coding standard.
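
A hinter might guard against this by checking the NAL units of a sample before building hint references. The following is a minimal sketch with hypothetical names. It assumes the AVC NAL unit header layout, where the low five bits of the first byte carry nal_unit_type, and it assumes that the aggregator and extractor structures use type values 30 and 31; these values should be checked against Annex A and are passed as a parameter for that reason.

```python
from typing import FrozenSet, List, Sequence

# Assumed type values for aggregators and extractors; verify against Annex A.
ASSUMED_FILE_FORMAT_ONLY_TYPES: FrozenSet[int] = frozenset({30, 31})


def nal_unit_type(first_header_byte: int) -> int:
    """Extract nal_unit_type from the first byte of an AVC NAL unit header."""
    return first_header_byte & 0x1F


def find_file_format_only_nal_units(
    nal_units: Sequence[bytes],
    file_only_types: FrozenSet[int] = ASSUMED_FILE_FORMAT_ONLY_TYPES,
) -> List[int]:
    """Return the indices of NAL units that must not be referenced directly
    by a hint track, so the hinter can resolve them to the underlying data."""
    return [
        index
        for index, nal in enumerate(nal_units)
        if nal and nal_unit_type(nal[0]) in file_only_types
    ]
```

The function only reports offending positions; it does not drop them, since the data that an extractor references still needs to be hinted directly.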
5 AVC elementary streams and sample definitions
5.1 Introduction
The Advanced Video Coding (AVC) standard, jointly developed by the ITU-T and
ISO/IEC JTC 1/SC 29/WG 11 (MPEG), offers not only increased coding efficiency and enhanced robustness,
but also many features for the systems that use it. To enable the best visibility of, and access to, those
features, and to enhance the opportunities for the interchange and interoperability of media, this part of
ISO/IEC 14496 defines a storage format for video streams compressed using AVC.
This clause defines the storage for plain AVC streams, where ‘plain AVC’ refers to the main part of
ISO/IEC 14496-10, excluding Annex G (Scalable Video Coding) and Annex H (Multiview Video Coding).
This clause specifies the elementary stream and sample structure used to store AVC visual content.
The storage of AVC content uses the existing capabilities of the ISO base media file format but also defines
extensions to support the following features of the AVC codec.
Switching pictures:
to enable switching between different coded streams and substitution of pictures within the same stream.