• Picture parameter set (PPS): Contains information which may vary on a per-picture basis. Examples of
information contained in the PPS are: quantization parameter and flags indicating the use of particular coding
tools (e.g. Transform Skip).
Parameter sets may reference other parameter sets: a PPS carries an ID indicating the associated SPS, and an
SPS carries an ID indicating the associated VPS. Regardless of these associations, to facilitate parsing robustness, each
parameter set can be parsed independently, i.e. there is no conditional dependency in the syntax parsing on information
present in any associated parameter set.
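The ID-based association can be illustrated with a minimal sketch. The class and field names below (ParameterSetStore, vps_id, sps_id, pps_id) are hypothetical and chosen for illustration only; they are not the syntax element names of the specification. The sketch assumes each parameter set is stored as soon as it is parsed, and that the chain PPS → SPS → VPS is only resolved later, when a picture needs it.

```python
from dataclasses import dataclass


@dataclass
class VPS:
    vps_id: int


@dataclass
class SPS:
    sps_id: int
    vps_id: int  # ID of the associated VPS


@dataclass
class PPS:
    pps_id: int
    sps_id: int  # ID of the associated SPS


class ParameterSetStore:
    """Hypothetical container holding parsed parameter sets by ID."""

    def __init__(self):
        self.vps = {}
        self.sps = {}
        self.pps = {}

    def add(self, ps):
        # Storing a set never looks at any other set, mirroring the fact
        # that each parameter set can be parsed independently.
        if isinstance(ps, VPS):
            self.vps[ps.vps_id] = ps
        elif isinstance(ps, SPS):
            self.sps[ps.sps_id] = ps
        elif isinstance(ps, PPS):
            self.pps[ps.pps_id] = ps
        else:
            raise TypeError("not a parameter set")

    def resolve(self, pps_id):
        # The associations are followed only when the sets are needed.
        pps = self.pps[pps_id]
        sps = self.sps[pps.sps_id]
        vps = self.vps[sps.vps_id]
        return vps, sps, pps
```

Because storage does not depend on the referenced sets, the arrival order in this sketch is free: a PPS can be added before the SPS it points to, and resolve() succeeds once all three are present.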
4.1.3 Picture types
Random access functionality is provided using intra random access point (IRAP) pictures. An IRAP picture contains
only I-slices. HEVC defines three types of IRAP picture:
• Instantaneous decoding refresh (IDR)
• Clean random access (CRA)
• Broken link access (BLA)
An IDR picture, when encountered, results in flushing of the decoded picture buffer (DPB). IDR pictures provide random
access functionality but sacrifice coding performance because pictures decoded prior to an IDR picture are no longer
available for reference in inter coding. To allow random access to the content while maintaining coding performance,
HEVC defines CRA pictures. CRA pictures are intra coded but, when encountered, do not empty the DPB. Consequently,
some pictures following a CRA picture in decoding order can still use reference pictures that precede the CRA picture in
decoding order. A CRA picture may be followed in decoding order by leading pictures, i.e. pictures that precede it in
output order; when decoding starts at the CRA picture, leading pictures are either decoded or skipped. Leading pictures
that can be correctly decoded in that case are called random access decodable leading (RADL) pictures. Leading pictures
that cannot be correctly decoded unless reference pictures preceding the CRA picture have also been decoded are called
random access skipped leading (RASL) pictures.
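The behaviour of a decoder that starts decoding at an IRAP picture can be sketched as follows. The numeric values are the nal_unit_type codes of the HEVC specification; the function names and the representation of a bitstream as a plain list of type codes are assumptions made for illustration. The sketch skips the RASL pictures associated with the first IRAP picture only, since RASL pictures associated with a later CRA picture have their references available.

```python
from enum import IntEnum


class NalUnitType(IntEnum):
    TRAIL_R = 1
    RADL_N = 6
    RADL_R = 7
    RASL_N = 8
    RASL_R = 9
    BLA_W_LP = 16
    BLA_W_RADL = 17
    BLA_N_LP = 18
    IDR_W_RADL = 19
    IDR_N_LP = 20
    CRA_NUT = 21


def is_irap(nut):
    # Types 16..23 are reserved for IRAP pictures (22 and 23 are unused).
    return 16 <= nut <= 23


def is_rasl(nut):
    return nut in (NalUnitType.RASL_N, NalUnitType.RASL_R)


def drop_undecodable_leading(nal_types):
    """Return the picture types a decoder keeps when it starts at nal_types[0]."""
    if not nal_types or not is_irap(nal_types[0]):
        raise ValueError("random access must start at an IRAP picture")
    kept = [nal_types[0]]
    first_irap_active = True
    for nut in nal_types[1:]:
        if is_irap(nut):
            # From the next IRAP onwards all references are available.
            first_irap_active = False
        if first_irap_active and is_rasl(nut):
            continue  # references precede the access point: skip
        kept.append(nut)
    return kept
```

For example, starting at a CRA picture followed by one RASL picture, one RADL picture and a trailing picture, only the RASL picture is dropped.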
One example use case for this functionality is splicing bitstreams to insert advertisements in a television programme.
Consider the case where Bitstream 1 (B1) and Bitstream 2 (B2) are concatenated as B1·B2 and the picture which starts
the segment associated with B2 is a CRA picture. The RASL pictures following this CRA picture in decoding order in B2
cannot be correctly decoded because their associated reference pictures are not present in the DPB. These RASL pictures
should be discarded from the decoder output; this is accomplished by the splicing operation declaring the CRA picture
in B2 to be a BLA picture. The decoder then knows that none of the RASL pictures associated with this BLA picture are
to be displayed.
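The splicing case can be sketched as two small steps: the splicer relabels the first CRA picture of B2 as a BLA picture, and the decoder discards the RASL pictures associated with a BLA picture. This is a simplified illustration under stated assumptions: bitstreams are modelled as lists of nal_unit_type codes, the function names are hypothetical, BLA_W_LP is chosen as the BLA type because leading pictures may follow, and any other changes a real splicer would have to make to the bitstream are ignored.

```python
CRA_NUT = 21
BLA_TYPES = (16, 17, 18)   # BLA_W_LP, BLA_W_RADL, BLA_N_LP
BLA_W_LP = 16
RASL_TYPES = (8, 9)        # RASL_N, RASL_R


def splice(b1, b2):
    """Concatenate B1 and B2, declaring the leading CRA of B2 to be a BLA."""
    if not b2 or b2[0] != CRA_NUT:
        raise ValueError("this sketch expects B2 to start with a CRA picture")
    return list(b1) + [BLA_W_LP] + list(b2[1:])


def output_pictures(stream):
    """Drop the RASL pictures associated with a BLA picture."""
    out = []
    after_bla = False
    for nut in stream:
        if 16 <= nut <= 23:
            # A new IRAP picture starts; remember whether it is a BLA.
            after_bla = nut in BLA_TYPES
        if after_bla and nut in RASL_TYPES:
            continue  # references were in the other bitstream: discard
        out.append(nut)
    return out
```

With B1 = [19, 1, 1] (an IDR picture and two trailing pictures) and B2 = [21, 8, 7, 1], splice() yields a stream whose fourth picture is a BLA picture, and output_pictures() removes the single RASL picture that follows it.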
Finally, HEVC also defines two additional types of pictures to support temporal scalability:
• Temporal sub-layer access (TSA)
• Step-wise temporal sub-layer access (STSA)
These pictures impose restrictions on the references used between different temporal layers so that temporal
down-switching and up-switching operations are possible (see Figure 5 in [5] for an example of the use of TSA and
STSA pictures).
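Temporal down-switching amounts to discarding the NAL units that belong to the higher temporal sub-layers. The sketch below reads the two-byte HEVC NAL unit header (1 bit forbidden_zero_bit, 6 bits nal_unit_type, 6 bits nuh_layer_id, 3 bits nuh_temporal_id_plus1) and keeps only the NAL units whose temporal ID does not exceed a target. The function names and the representation of a bitstream as a list of byte strings without start codes are assumptions for illustration; up-switching, which depends on the position of TSA and STSA pictures, is not covered.

```python
def parse_nal_header(nal: bytes):
    """Return (nal_unit_type, nuh_layer_id, temporal_id) from a NAL unit."""
    if len(nal) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    b0, b1 = nal[0], nal[1]
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id = (b1 & 0x07) - 1   # nuh_temporal_id_plus1 minus 1
    return nal_unit_type, nuh_layer_id, temporal_id


def down_switch(nal_units, max_temporal_id):
    """Keep only NAL units whose temporal ID is at most max_temporal_id."""
    return [nal for nal in nal_units
            if parse_nal_header(nal)[2] <= max_temporal_id]
```

For instance, the header bytes 0x2A 0x01 describe a CRA picture in temporal sub-layer 0, while 0x04 0x03 describe a TSA picture in temporal sub-layer 2; calling down_switch() with a target of 1 keeps the former and drops the latter.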
4.1.4 Reference picture set
The reference picture set (RPS) was introduced in HEVC to manage reference pictures in the DPB. When a picture is no
longer used for reference by other pictures, it should be discarded from the DPB. If instead a picture is used as a
reference for future pictures, it must be kept in the DPB to correctly decode the bitstream. The RPS contains
information on the status of the DPB and may be signalled in the SPS, and additionally signalled, or overridden, in the
slice header. The signalling is absolute, i.e. each RPS describes the complete DPB status and does not refer to any
previous status for its description. This improves bitstream error resilience, since the DPB status can still be
established when some NAL units are lost.
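The effect of absolute signalling can be illustrated with a minimal sketch. Pictures are identified here by their picture order count (POC) and the RPS is modelled as a plain set of POC values; the actual bitstream syntax and the handling of pictures that are kept only because they still await output are deliberately left out, and the function name apply_rps is hypothetical.

```python
def apply_rps(dpb, rps):
    """Update the DPB from the RPS of the current picture.

    dpb: set of POC values of the pictures currently held in the DPB.
    rps: set of POC values the current picture's RPS declares as reference.
    Returns (kept, missing): the pictures that stay in the DPB, and the
    listed pictures that are absent, e.g. because NAL units were lost.
    """
    kept = dpb & rps      # everything not listed is discarded
    missing = rps - dpb   # listed but never received
    return kept, missing
```

Because the RPS lists the complete set of required pictures, the result depends only on the current DPB content and the current RPS. If the DPB holds {0, 4} and the RPS lists {4, 8}, picture 0 is discarded and picture 8 is reported as missing, so the decoder can detect the loss instead of silently using a wrong reference.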
4.2 Picture partitioning
4.2.1 Coding tree unit (CTU) partitioning
Pictures are divided into a sequence of coding tree units (CTUs), all of the same size, each covering a square
pixel region of the picture. An example of a picture divided into CTUs is shown in Figure 4-1. The size of a CTU is
specified with respect to the luma channel, to avoid ambiguity across chroma formats.
The size of the CTU is configured as one of 16×16, 32×32 or 64×64 luma samples.
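As a worked illustration, the number of CTUs covering a picture can be computed from the luma picture dimensions and the CTU size. The sketch assumes that, when a dimension is not a multiple of the CTU size, the last column or row of CTUs extends beyond the picture boundary, so the counts are rounded up. The function name ctu_grid is hypothetical.

```python
def ctu_grid(width, height, ctu_size):
    """Return (columns, rows, total) of CTUs for a luma picture size."""
    if ctu_size not in (16, 32, 64):
        raise ValueError("CTU size must be 16, 32 or 64 luma samples")
    if width <= 0 or height <= 0:
        raise ValueError("picture dimensions must be positive")
    # Integer ceiling division: partial CTUs at the border still count.
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows, cols * rows
```

For a 1920×1080 picture with 64×64 CTUs this gives 30 columns and 17 rows, i.e. 510 CTUs, the bottom row being only partly inside the picture since 1080 is not a multiple of 64.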