H.265/HEVC Standard Explained: A New-Generation Video Compression Technology
"H.265/HEVC Standard White Paper (January 2013)"
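H.265/High Efficiency Video Coding (HEVC) is an advanced video coding standard designed to substantially improve compression efficiency: compared with its predecessor, H.264/AVC, it needs roughly 50% less bandwidth for the same video quality. The standard was developed jointly by ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 and was published in 2013.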
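The white paper's table of contents covers several key aspects of the standard:
1. **Introduction**: the background, purpose, scope of application, and version information of the standard, along with a discussion of profiles, levels, and application scenarios.
2. **Scope**: defines the technical area covered by the standard.
3. **Normative references**: lists the relevant Recommendations and International Standards, as well as standards with technically equivalent content.
4. **Definitions**: provides the terms and definitions used in the standard.
5. **Abbreviations**: lists the abbreviations that appear in the standard.
6. **Conventions**: specifies the arithmetic, logical, relational, bit-wise, and assignment operators, range notation, mathematical functions, order of operation precedence, and the way variables, syntax elements, and tables are described.
7. **Bitstream and picture formats, partitionings, scanning processes, and neighbouring relationships**: describes the bitstream formats, the source, decoded, and output picture formats, and how pictures are partitioned into slices, slice segments, tiles, coding tree units, and coding tree blocks. It also defines the various scanning processes.
8. **Syntax and semantics**: the core of the standard. It specifies how syntax is presented in tabular form, the syntax functions and descriptors, and the syntax of NAL units, raw byte sequence payloads, trailing bits, and byte alignment. It also lists the syntax of the different RBSP (raw byte sequence payload) types, such as the video parameter set, sequence parameter set, and picture parameter set, and covers syntax structures such as reference picture list modification, weighted prediction parameters, short-term reference picture sets, and slice segment data, together with the semantics of each structure.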
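HEVC achieves its higher compression efficiency through more sophisticated coding tools: finer block partitioning, multi-mode motion estimation, enhanced prediction, and improved transform and entropy coding. Together these make HEVC well suited to HD and UHD video delivery, especially over bandwidth-constrained networks. As 4K and 8K video becomes more widespread, the H.265/HEVC standard grows in importance.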
ISO/IEC 23008-2 : 201x (E)
4 Draft Rec. ITU-T H.HEVC (201x E)
3.6 bin: One bit of a bin string.
3.7 binarization: A set of bin strings for all possible values of a syntax element.
3.8 binarization process: A unique mapping process of all possible values of a syntax element onto a set of bin
strings.
3.9 bin string: An intermediate binary representation of values of syntax elements from the binarization of the
syntax element.
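As an illustration of definitions 3.6 to 3.9, the sketch below builds a binarization for a small value range. The unary scheme used here is only an assumption chosen for simplicity; this excerpt does not say which binarization the standard assigns to any particular syntax element, and the function names are hypothetical.

```python
def unary_bin_string(value):
    """Map a non-negative value to a bin string: `value` ones, then one zero.

    Unary coding is an illustrative choice, not a scheme taken from this excerpt.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return "1" * value + "0"


def binarization(max_value):
    """Return the set of bin strings for all values 0..max_value, keyed by value."""
    return {v: unary_bin_string(v) for v in range(max_value + 1)}


table = binarization(3)
for value, bin_string in table.items():
    # Each character of a bin string is one bin.
    print(value, bin_string, "bins:", len(bin_string))
```

Because every value maps to a distinct bin string, the mapping is unique, which is the property 3.8 requires of a binarization process.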
3.10 bi-predictive (B) slice: A slice that may be decoded using intra prediction or inter prediction using at most
two motion vectors and reference indices to predict the sample values of each block.
3.11 bitstream: A sequence of bits, in the form of a NAL unit stream or a byte stream, that forms the representation
of coded pictures and associated data forming one or more CVSs.
3.12 block: An MxN (M-column by N-row) array of samples, or an MxN array of transform coefficients.
3.13 broken link: A location in a bitstream at which it is indicated that some subsequent pictures in decoding order
may contain serious visual artefacts due to unspecified operations performed in the generation of the bitstream.
3.14 broken link access (BLA) access unit: An access unit in which the coded picture is a BLA picture.
3.15 broken link access (BLA) picture: An IRAP picture for which each VCL NAL unit has nal_unit_type equal to
BLA_W_LP, BLA_W_RADL, or BLA_N_LP.
NOTE 2 – A BLA picture contains only I slices, and may be the first picture in the bitstream in decoding order, or
may appear later in the bitstream. Each BLA picture begins a new CVS, and has the same effect on the decoding
process as an IDR picture. However, a BLA picture contains syntax elements that specify a non-empty RPS. When
each VCL NAL unit of a BLA picture has nal_unit_type equal to BLA_W_LP, the picture may have associated RASL
pictures, which are not output by the decoder and may not be decodable, as they may contain references to pictures
that are not present in the bitstream. When each VCL NAL unit of a BLA picture has nal_unit_type equal to
BLA_W_LP, the picture may also have associated RADL pictures, which are specified to be decoded. When each
VCL NAL unit of a BLA picture has nal_unit_type equal to BLA_W_RADL, the picture does not have associated
RASL pictures but may have associated RADL pictures. When each VCL NAL unit of a BLA picture has
nal_unit_type equal to BLA_N_LP, the picture does not have any associated leading pictures.
3.16 buffering period: The set of access units starting with an access unit that contains a buffering period SEI
message and containing all subsequent access units in decoding order up to but not including the next access
unit (when present) that contains a buffering period SEI message.
3.17 byte: A sequence of 8 bits, within which, when written or read as a sequence of bit values, the left-most and
right-most bits represent the most and least significant bits, respectively.
3.18 byte-aligned: A position in a bitstream is byte-aligned when the position is an integer multiple of 8 bits from
the position of the first bit in the bitstream, and a bit or byte or syntax element is said to be byte-aligned when
the position at which it appears in a bitstream is byte-aligned.
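The byte-alignment condition in 3.18 reduces to a modulo test on the bit position. A minimal sketch, with hypothetical function names and bit positions counted from 0 at the first bit of the bitstream:

```python
def is_byte_aligned(bit_position):
    """True when the position is an integer multiple of 8 bits from the first bit."""
    return bit_position % 8 == 0


def bits_to_alignment(bit_position):
    """Number of bits to skip to reach the next byte-aligned position (0 if already aligned)."""
    return (-bit_position) % 8


print(is_byte_aligned(16), is_byte_aligned(13), bits_to_alignment(13))
```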
3.19 byte stream: An encapsulation of a NAL unit stream containing start code prefixes and NAL units as specified
in Annex B.
3.20 can: A term used to refer to behaviour that is allowed, but not necessarily required.
3.21 chroma: An adjective, represented by the symbols Cb and Cr, specifying that a sample array or single sample
is representing one of the two colour difference signals related to the primary colours.
NOTE 3 – The term chroma is used rather than the term chrominance in order to avoid the implication of the use of
linear light transfer characteristics that is often associated with the term chrominance.
3.22 clean random access (CRA) access unit: An access unit in which the coded picture is a CRA picture.
3.23 clean random access (CRA) picture: An IRAP picture for which each VCL NAL unit has nal_unit_type equal
to CRA_NUT.
NOTE 4 – A CRA picture contains only I slices, and may be the first picture in the bitstream in decoding order, or
may appear later in the bitstream. A CRA picture may have associated RADL or RASL pictures. When a CRA
picture has NoRaslOutputFlag equal to 1, the associated RASL pictures are not output by the decoder, because they
may not be decodable, as they may contain references to pictures that are not present in the bitstream.
3.24 coded picture: A coded representation of a picture containing all coding tree units of the picture.
3.25 coded picture buffer (CPB): A first-in first-out buffer containing decoding units in decoding order specified
in the hypothetical reference decoder in Annex C.
3.26 coded representation: A data element as represented in its coded form.
3.27 coded slice segment NAL unit: A NAL unit that has nal_unit_type in the range of TRAIL_N to RASL_R,
inclusive, or in the range of BLA_W_LP to RSV_IRAP_VCL23, inclusive, which indicates that the NAL unit
contains a coded slice segment.
3.28 coded video sequence (CVS): A sequence of access units that consists, in decoding order, of an IRAP access
unit with NoRaslOutputFlag equal to 1, followed by zero or more access units that are not IRAP access units
with NoRaslOutputFlag equal to 1, including all subsequent access units up to but not including any
subsequent access unit that is an IRAP access unit with NoRaslOutputFlag equal to 1.
NOTE 5 – An IRAP access unit may be an IDR access unit, a BLA access unit, or a CRA access unit. The value of
NoRaslOutputFlag is equal to 1 for each IDR access unit, each BLA access unit, and each CRA access unit that is the
first access unit in the bitstream in decoding order, is the first access unit that follows an end of sequence NAL unit in
decoding order, or has HandleCraAsBlaFlag equal to 1.
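The CVS definition in 3.28 and NOTE 5 can be read as a simple grouping rule: a new CVS starts at every IRAP access unit whose NoRaslOutputFlag is 1. The sketch below follows that reading. The tuple layout and function names are hypothetical, and returning 0 for a CRA access unit that meets none of the listed conditions is an assumption based on NOTE 5 rather than something this excerpt states outright.

```python
def no_rasl_output_flag(kind, first_in_bitstream=False,
                        follows_end_of_sequence=False, handle_cra_as_bla=False):
    """Derive NoRaslOutputFlag for an IRAP access unit of kind 'IDR', 'BLA' or 'CRA'."""
    if kind in ("IDR", "BLA"):
        return 1
    if kind == "CRA":
        special = first_in_bitstream or follows_end_of_sequence or handle_cra_as_bla
        return 1 if special else 0
    raise ValueError("not an IRAP access unit kind: " + kind)


def split_into_cvs(access_units):
    """Group (name, is_irap, flag) tuples, given in decoding order, into CVSs."""
    sequences = []
    for name, is_irap, flag in access_units:
        if is_irap and flag == 1:
            sequences.append([name])      # this access unit starts a new CVS
        elif sequences:
            sequences[-1].append(name)    # belongs to the current CVS
        else:
            raise ValueError("stream must start with an IRAP access unit with flag 1")
    return sequences


stream = [
    ("IDR0", True, no_rasl_output_flag("IDR")),
    ("P1", False, 0),
    ("CRA2", True, no_rasl_output_flag("CRA")),  # mid-stream CRA: does not start a CVS
    ("P3", False, 0),
    ("IDR4", True, no_rasl_output_flag("IDR")),
]
cvs_list = split_into_cvs(stream)
print(cvs_list)
```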
3.29 coding block: An NxN block of samples for some value of N such that the division of a coding tree block into
coding blocks is a partitioning.
3.30 coding tree block: An NxN block of samples for some value of N such that the division of a component into
coding tree blocks is a partitioning.
3.31 coding tree unit: A coding tree block of luma samples, two corresponding coding tree blocks of chroma
samples of a picture that has three sample arrays, or a coding tree block of samples of a monochrome picture
or a picture that is coded using three separate colour planes and syntax structures used to code the samples.
3.32 coding unit: A coding block of luma samples, two corresponding coding blocks of chroma samples of a
picture that has three sample arrays, or a coding block of samples of a monochrome picture or a picture that is
coded using three separate colour planes and syntax structures used to code the samples.
3.33 component: An array or single sample from one of the three arrays (luma and two chroma) that compose a
picture in 4:2:0, 4:2:2, or 4:4:4 colour format or the array or a single sample of the array that compose a picture
in monochrome format.
3.34 context variable: A variable specified for the adaptive binary arithmetic decoding process of a bin by an
equation containing recently decoded bins.
3.35 cropped decoded picture: The result of cropping a decoded picture based on the conformance cropping
window specified in the SPS that is referred to by the corresponding coded picture.
3.36 decoded picture: A decoded picture is derived by decoding a coded picture.
3.37 decoded picture buffer (DPB): A buffer holding decoded pictures for reference, output reordering, or output
delay specified for the hypothetical reference decoder in Annex C.
3.38 decoder: An embodiment of a decoding process.
3.39 decoder under test (DUT): A decoder that is tested for conformance to this Specification by operating the
hypothetical stream scheduler to deliver a conforming bitstream to the decoder and to the hypothetical
reference decoder and comparing the values and timing or order of the output of the two decoders.
3.40 decoding order: The order in which syntax elements are processed by the decoding process.
3.41 decoding process: The process specified in this Specification that reads a bitstream and derives decoded
pictures from it.
3.42 decoding unit: An access unit if SubPicHrdFlag is equal to 0 or a subset of an access unit otherwise,
consisting of one or more VCL NAL units in an access unit and the associated non-VCL NAL units.
3.43 dependent slice segment: A slice segment for which the values of some syntax elements of the slice segment
header are inferred from the values for the preceding independent slice segment in decoding order.
3.44 display process: A process not specified in this Specification having, as its input, the cropped decoded
pictures that are the output of the decoding process.
3.45 elementary stream: A sequence of one or more bitstreams.
NOTE 6 – An elementary stream that consists of two or more bitstreams would typically have been formed by
splicing together two or more bitstreams (or parts thereof).
3.46 emulation prevention byte: A byte equal to 0x03 that is present within a NAL unit when the syntax elements
of the bitstream form certain patterns of byte values in a manner that ensures that no sequence of consecutive
byte-aligned bytes in the NAL unit can contain a start code prefix.
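A minimal sketch of how emulation prevention bytes can be inserted and removed. It assumes the commonly described rule (not spelled out in this excerpt) that a 0x03 byte is inserted whenever two consecutive zero bytes would be followed by a byte in the range 0x00 to 0x03, and that the start code prefix is the three-byte pattern 0x000001 from Annex B of the published standard. The special case of a payload ending in a zero byte is left out, and the function names are hypothetical.

```python
def add_emulation_prevention(rbsp):
    """Insert 0x03 after any two zero bytes that are followed by a byte <= 0x03."""
    out = bytearray()
    zeros = 0  # count of consecutive zero bytes written so far
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(payload):
    """Drop each 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue  # skip the emulation prevention byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


rbsp = bytes([0x00, 0x00, 0x01, 0x25, 0x00, 0x00, 0x00, 0x00, 0x03])
encoded = add_emulation_prevention(rbsp)
print(encoded.hex())
```

After insertion the payload no longer contains the pattern 0x000001, and removing the inserted bytes restores the original data.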
3.47 encoder: An embodiment of an encoding process.
3.48 encoding process: A process not specified in this Specification that produces a bitstream conforming to this
Specification.
3.49 field: An assembly of alternative rows of samples of a frame.
3.50 filler data NAL units: NAL units with nal_unit_type equal to FD_NUT.
3.51 flag: A variable that can take one of the two possible values 0 and 1.
3.52 frame: The composition of a top field and a bottom field, where sample rows 0, 2, 4, ... originate from the top
field and sample rows 1, 3, 5, ... originate from the bottom field.
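The row interleaving described in 3.49 and 3.52 can be sketched with list slicing. Rows are represented here as opaque list items, and the function names are hypothetical.

```python
def split_frame_into_fields(frame_rows):
    """Return (top_field, bottom_field): even-numbered rows and odd-numbered rows."""
    return frame_rows[0::2], frame_rows[1::2]


def merge_fields_into_frame(top_field, bottom_field):
    """Interleave the fields so rows 0, 2, 4, ... come from the top field."""
    frame = []
    for row in range(len(top_field) + len(bottom_field)):
        source = top_field if row % 2 == 0 else bottom_field
        frame.append(source[row // 2])
    return frame


frame = ["row0", "row1", "row2", "row3"]
top, bottom = split_frame_into_fields(frame)
print(top, bottom)
```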
3.53 frequency index: A one-dimensional or two-dimensional index associated with a transform coefficient prior to
an inverse transform part of the decoding process.
3.54 hypothetical reference decoder (HRD): A hypothetical decoder model that specifies constraints on the
variability of conforming NAL unit streams or conforming byte streams that an encoding process may produce.
3.55 hypothetical stream scheduler (HSS): A hypothetical delivery mechanism used for checking the
conformance of a bitstream or a decoder with regards to the timing and data flow of the input of a bitstream
into the hypothetical reference decoder.
3.56 independent slice segment: A slice segment for which the values of the syntax elements of the slice segment
header are not inferred from the values for a preceding slice segment.
3.57 informative: A term used to refer to content provided in this Specification that does not establish any
mandatory requirements for conformance to this Specification and thus is not considered an integral part of this
Specification.
3.58 instantaneous decoding refresh (IDR) access unit: An access unit in which the coded picture is an IDR
picture.
3.59 instantaneous decoding refresh (IDR) picture: An IRAP picture for which each VCL NAL unit has
nal_unit_type equal to IDR_W_RADL or IDR_N_LP.
NOTE 7 – An IDR picture contains only I slices, and may be the first picture in the bitstream in decoding order, or
may appear later in the bitstream. Each IDR picture is the first picture of a CVS in decoding order. When each VCL
NAL unit of an IDR picture has nal_unit_type equal to IDR_W_RADL, the picture may have associated RADL
pictures. When each VCL NAL unit of an IDR picture has nal_unit_type equal to IDR_N_LP, the picture does not
have any associated leading pictures. An IDR picture does not have associated RASL pictures.
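The leading-picture rules stated in NOTE 2, NOTE 4 and NOTE 7 can be collected into one lookup table. This is only a summary sketch keyed by the nal_unit_type names used in those notes; the table and function names are hypothetical.

```python
# Which kinds of leading pictures each IRAP type may have, per NOTES 2, 4 and 7.
ALLOWED_LEADING = {
    "BLA_W_LP": {"RASL", "RADL"},
    "BLA_W_RADL": {"RADL"},
    "BLA_N_LP": set(),
    "CRA_NUT": {"RASL", "RADL"},
    "IDR_W_RADL": {"RADL"},
    "IDR_N_LP": set(),
}


def may_have_leading(irap_type, leading_kind):
    """True if an IRAP picture of this type may have leading pictures of this kind."""
    return leading_kind in ALLOWED_LEADING[irap_type]


print(may_have_leading("BLA_W_RADL", "RASL"), may_have_leading("CRA_NUT", "RASL"))
```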
3.60 inter coding: Coding of a coding block, slice, or picture that uses inter prediction.
3.61 inter prediction: A prediction derived in a manner that is dependent on data elements (e.g. sample values or
motion vectors) of pictures other than the current picture.
3.62 intra coding: Coding of a coding block, slice, or picture that uses intra prediction.
3.63 intra prediction: A prediction derived from only data elements (e.g. sample values) of the same decoded slice.
3.64 intra random access point (IRAP) access unit: An access unit in which the coded picture is an IRAP picture.
3.65 intra random access point (IRAP) picture: A coded picture for which each VCL NAL unit has nal_unit_type
in the range of BLA_W_LP to RSV_IRAP_VCL23, inclusive.
NOTE 8 – An IRAP picture contains only I slices, and may be a BLA picture, a CRA picture or an IDR picture. The
first picture in the bitstream in decoding order must be an IRAP picture. Provided the necessary parameter sets are
available when they need to be activated, the IRAP picture and all subsequent non-RASL pictures in decoding order
can be correctly decoded without performing the decoding process of any pictures that precede the IRAP picture in
decoding order. There may be pictures in a bitstream that contain only I slices that are not IRAP pictures.
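Definitions 3.27 and 3.65 classify NAL units by ranges of nal_unit_type. The sketch below expresses those range checks. The numeric values are not given in this excerpt; they are taken from the nal_unit_type table of the published standard as best recalled and should be verified against it. The function names are hypothetical.

```python
from enum import IntEnum


class NalUnitType(IntEnum):
    # Numeric codes are assumptions to be checked against the standard's table.
    TRAIL_N = 0
    RASL_R = 9
    BLA_W_LP = 16
    BLA_W_RADL = 17
    BLA_N_LP = 18
    IDR_W_RADL = 19
    IDR_N_LP = 20
    CRA_NUT = 21
    RSV_IRAP_VCL23 = 23


def is_irap(nal_unit_type):
    """3.65: IRAP pictures use types BLA_W_LP..RSV_IRAP_VCL23, inclusive."""
    return NalUnitType.BLA_W_LP <= nal_unit_type <= NalUnitType.RSV_IRAP_VCL23


def is_coded_slice_segment(nal_unit_type):
    """3.27: types TRAIL_N..RASL_R or BLA_W_LP..RSV_IRAP_VCL23, inclusive."""
    in_low_range = NalUnitType.TRAIL_N <= nal_unit_type <= NalUnitType.RASL_R
    return in_low_range or is_irap(nal_unit_type)


print(is_irap(NalUnitType.CRA_NUT), is_coded_slice_segment(NalUnitType.RASL_R))
```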
3.66 intra (I) slice: A slice that is decoded using intra prediction only.
3.67 inverse transform: A part of the decoding process by which a set of transform coefficients are converted into
spatial-domain values.
3.68 layer: A set of VCL NAL units that all have a particular value of nuh_layer_id and the associated non-VCL
NAL units, or one of a set of syntactical structures having a hierarchical relationship.
NOTE 9 – Depending on the context, either the first layer concept or the second layer concept applies. The first layer
concept is also referred to as a scalable layer, wherein a layer may be a spatial scalable layer, a quality scalable layer,
a view, etc. A temporal true subset of a scalable layer is not referred to as a layer but referred to as a sub-layer or
temporal sub-layer. The second layer concept is also referred to as a coding layer, wherein higher layers contain lower
layers, and the coding layers are the CVS, picture, slice, slice segment, and coding tree unit layers.
3.69 layer identifier list: A list of nuh_layer_id values that is associated with a layer set or an operation point and
can be used as an input to the sub-bitstream extraction process.
3.70 layer set: A set of layers represented within a bitstream created from another bitstream by operation of the
sub-bitstream extraction process with that other bitstream, the target highest TemporalId equal to 6, and the
target layer identifier list equal to the layer identifier list associated with the layer set as inputs.
3.71 leading picture: A picture that precedes the associated IRAP picture in output order.
3.72 leaf: A terminating node of a tree that is a root node of a tree of depth 0.
3.73 level: A defined set of constraints on the values that may be taken by the syntax elements and variables of this
Specification, or the value of a transform coefficient prior to scaling.
NOTE 10 – The same set of levels is defined for all profiles, with most aspects of the definition of each level being in
common across different profiles. Individual implementations may, within the specified constraints, support a
different level for each supported profile.
3.74 list 0 (list 1) motion vector: A motion vector associated with a reference index pointing into reference picture
list 0 (list 1).
3.75 list 0 (list 1) prediction: Inter prediction of the content of a slice using a reference index pointing into
reference picture list 0 (list 1).
3.76 long-term reference picture: A picture that is marked as "used for long-term reference".
3.77 long-term reference picture set: The two RPS lists that may contain long-term reference pictures.
3.78 luma: An adjective, represented by the symbol or subscript Y or L, specifying that a sample array or single
sample is representing the monochrome signal related to the primary colours.
NOTE 11 – The term luma is used rather than the term luminance in order to avoid the implication of the use of linear
light transfer characteristics that is often associated with the term luminance. The symbol L is sometimes used instead
of the symbol Y to avoid confusion with the symbol y as used for vertical location.
3.79 may: A term that is used to refer to behaviour that is allowed, but not necessarily required.
NOTE 12 – In some places where the optional nature of the described behaviour is intended to be emphasized, the
phrase "may or may not" is used to provide emphasis.
3.80 motion vector: A two-dimensional vector used for inter prediction that provides an offset from the coordinates
in the decoded picture to the coordinates in a reference picture.
3.81 must: A term that is used in expressing an observation about a requirement or an implication of a requirement
that is specified elsewhere in this Specification (used exclusively in an informative context).
3.82 nested SEI message: An SEI message that is contained in a scalable nesting SEI message.
3.83 network abstraction layer (NAL) unit: A syntax structure containing an indication of the type of data to
follow and bytes containing that data in the form of an RBSP interspersed as necessary with emulation
prevention bytes.
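The "indication of the type of data to follow" in 3.83 is carried in the NAL unit header. The excerpt does not give the header layout, so the sketch below assumes the two-byte layout of the published standard: a 1-bit forbidden_zero_bit, a 6-bit nal_unit_type, a 6-bit nuh_layer_id and a 3-bit nuh_temporal_id_plus1. Treat the field widths as an assumption to check against the NAL unit header syntax; the function name is hypothetical.

```python
def parse_nal_unit_header(header):
    """Split the first two bytes of a NAL unit into its four header fields."""
    if len(header) < 2:
        raise ValueError("a NAL unit header needs two bytes")
    value = (header[0] << 8) | header[1]
    return {
        "forbidden_zero_bit": (value >> 15) & 0x1,
        "nal_unit_type": (value >> 9) & 0x3F,
        "nuh_layer_id": (value >> 3) & 0x3F,
        "nuh_temporal_id_plus1": value & 0x7,
    }


fields = parse_nal_unit_header(bytes([0x26, 0x01]))
print(fields)
```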
3.84 network abstraction layer (NAL) unit stream: A sequence of NAL units.
3.85 non-nested SEI message: An SEI message that is not contained in a scalable nesting SEI message.
3.86 non-reference picture: A picture that is marked as "unused for reference".
NOTE 13 – A non-reference picture contains samples that cannot be used for inter prediction in the decoding process
of subsequent pictures in decoding order. In other words, once a picture is marked as "unused for reference", it can
never be marked back as "used for reference".
3.87 non-VCL NAL unit: A NAL unit that is not a VCL NAL unit.
3.88 note: A term that is used to prefix informative remarks (used exclusively in an informative context).
3.89 operation point: A bitstream created from another bitstream by operation of the sub-bitstream extraction
process with that other bitstream, a target highest TemporalId, and a target layer identifier list as inputs.
NOTE 14 – If the target highest TemporalId of an operation point is equal to the greatest value of TemporalId in the
layer set associated with the target layer identification list, the operation point is identical to the layer set. Otherwise it
is a subset of the layer set.
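The operation point in 3.89 is the result of sub-bitstream extraction. A simplified sketch of the core filtering step follows, assuming that extraction keeps only NAL units whose TemporalId does not exceed the target and whose nuh_layer_id is in the target list. The full process in the standard has further conditions that are omitted here, and the `NalUnit` type and function name are hypothetical.

```python
from typing import NamedTuple


class NalUnit(NamedTuple):
    nuh_layer_id: int
    temporal_id: int
    label: str


def extract_sub_bitstream(nal_units, target_highest_tid, target_layer_ids):
    """Keep NAL units within the target TemporalId and layer identifier list."""
    wanted_layers = set(target_layer_ids)
    return [
        unit for unit in nal_units
        if unit.temporal_id <= target_highest_tid
        and unit.nuh_layer_id in wanted_layers
    ]


units = [
    NalUnit(0, 0, "a"),
    NalUnit(0, 1, "b"),
    NalUnit(0, 2, "c"),
    NalUnit(1, 0, "d"),
]
operation_point = extract_sub_bitstream(units, target_highest_tid=1, target_layer_ids=[0])
print([unit.label for unit in operation_point])
```

Raising `target_highest_tid` to the greatest TemporalId present keeps every NAL unit of the listed layers, which matches the case NOTE 14 describes as identical to the layer set.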
3.90 output order: The order in which the decoded pictures are output from the decoded picture buffer (for the
decoded pictures that are to be output from the decoded picture buffer).
3.91 parameter: A syntax element of a VPS, SPS or PPS, or the second word of the defined term quantization
parameter.
3.92 partitioning: The division of a set into subsets such that each element of the set is in exactly one of the
subsets.
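The partitioning property in 3.92 (every element lies in exactly one subset) can be checked directly. A minimal sketch with a hypothetical function name:

```python
def is_partitioning(full_set, subsets):
    """True if every element of full_set is in exactly one of the subsets."""
    seen = set()
    for subset in subsets:
        subset = set(subset)
        if seen & subset:
            return False  # some element appears in more than one subset
        seen |= subset
    return seen == set(full_set)  # nothing missing, nothing extra


samples = {0, 1, 2, 3, 4, 5}
print(is_partitioning(samples, [{0, 1}, {2, 3}, {4, 5}]))
```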
3.93 picture: An array of luma samples in monochrome format or an array of luma samples and two corresponding
arrays of chroma samples in 4:2:0, 4:2:2, and 4:4:4 colour format.
NOTE 15 – A picture may be either a frame or a field. However, in one CVS, either all pictures are frames or all
pictures are fields.
3.94 picture parameter set (PPS): A syntax structure containing syntax elements that apply to zero or more entire
coded pictures as determined by a syntax element found in each slice segment header.
3.95 picture order count: A variable that is associated with each picture, uniquely identifies the associated picture
among all pictures in the CVS, and, when the associated picture is to be output from the decoded picture buffer,
indicates the position of the associated picture in output order relative to the output order positions of the other
pictures in the same CVS that are to be output from the decoded picture buffer.
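Since the picture order count in 3.95 gives each picture's position in output order within a CVS, sorting by it recovers output order from decoding order. The sketch below uses made-up labels and POC values and ignores pictures that are not output as well as the timing of the decoded picture buffer.

```python
# (label, picture order count) pairs listed in decoding order; values are illustrative.
decoding_order = [("I0", 0), ("P4", 4), ("B2", 2), ("b1", 1), ("b3", 3)]

# Within one CVS the POC is unique, so sorting by it yields the output order.
output_order = sorted(decoding_order, key=lambda picture: picture[1])
print([label for label, poc in output_order])
```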
3.96 prediction: An embodiment of the prediction process.
3.97 prediction block: A rectangular MxN block of samples on which the same prediction is applied.
3.98 prediction process: The use of a predictor to provide an estimate of the data element (e.g. sample value or
motion vector) currently being decoded.
3.99 prediction unit: A prediction block of luma samples, two corresponding prediction blocks of chroma samples
of a picture that has three sample arrays, or a prediction block of samples of a monochrome picture or a picture
that is coded using three separate colour planes and syntax structures used to predict the prediction block
samples.
3.100 predictive (P) slice: A slice that may be decoded using intra prediction or inter prediction using at most one
motion vector and reference index to predict the sample values of each block.
3.101 predictor: A combination of specified values or previously decoded data elements (e.g. sample value or
motion vector) used in the decoding process of subsequent data elements.
3.102 prefix SEI message: An SEI message that is contained in a prefix SEI NAL unit.
3.103 prefix SEI NAL unit: An SEI NAL unit that has nal_unit_type equal to PREFIX_SEI_NUT.
3.104 profile: A specified subset of the syntax of this Specification.
3.105 quadtree: A tree in which a parent node can be split into four child nodes, each of which may become parent
node for another split into four child nodes.
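A small sketch of the recursive splitting that a quadtree (3.105) allows: each node either stays a leaf or splits into four equal child blocks. The 64-sample root size, the 8-sample minimum and the split rule are hypothetical values chosen only for illustration, as is the function name.

```python
def quadtree_leaves(x, y, size, should_split, min_size):
    """Return the leaf blocks (x, y, size) of a quadtree rooted at the given block."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_leaves(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]


# Illustrative rule: keep splitting only the block at the top-left corner.
leaves = quadtree_leaves(0, 0, 64, lambda x, y, size: x == 0 and y == 0, min_size=8)
print(leaves)
```

The leaves do not overlap and together cover the root block, so the result is a partitioning in the sense of 3.92.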
3.106 quantization parameter: A variable used by the decoding process for scaling of transform coefficient levels.
3.107 random access: The act of starting the decoding process for a bitstream at a point other than the beginning of
the stream.
3.108 random access decodable leading (RADL) access unit: An access unit in which the coded picture is a RADL
picture.
3.109 random access decodable leading (RADL) picture: A coded picture for which each VCL NAL unit has
nal_unit_type equal to RADL_R or RADL_N.
NOTE 16 – All RADL pictures are leading pictures. RADL pictures are not used as reference pictures for the
decoding process of trailing pictures of the same associated IRAP picture. When present, all RADL pictures precede,
in decoding order, all trailing pictures of the same associated IRAP picture.
3.110 random access skipped leading (RASL) access unit: An access unit in which the coded picture is a RASL
picture.
3.111 random access skipped leading (RASL) picture: A coded picture for which each VCL NAL unit has
nal_unit_type equal to RASL_R or RASL_N.
NOTE 17 – All RASL pictures are leading pictures of an associated BLA or CRA picture. When the associated IRAP
picture has NoRaslOutputFlag equal to 1, the RASL picture is not output and may not be correctly decodable, as the
RASL picture may contain references to pictures that are not present in the bitstream. RASL pictures are not used as
reference pictures for the decoding process of non-RASL pictures. When present, all RASL pictures precede, in
decoding order, all trailing pictures of the same associated IRAP picture.