Next-Generation Video Coding Standard: A Comparative Analysis of HEVC and H.264

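"H.265-201304.pdf" is the April 2013 edition of Recommendation ITU-T H.265, High Efficiency Video Coding (HEVC). Published by ITU-T as a joint text with ISO/IEC (hence the wording "Recommendation | International Standard" in the text), the standard targets higher compression efficiency than its predecessor H.264: at comparable visual quality it is commonly reported to need roughly half the bit rate, with the exact saving depending on content and encoder. This makes it attractive for HD and UHD video wherever bandwidth or storage is limited.

Key points about HEVC:

1. **Coding efficiency**: HEVC gains efficiency from larger and more flexible block partitioning, a richer set of prediction modes, and improved entropy coding. Relative to H.264 the bit rate can be roughly halved at comparable quality.

2. **Block partitioning**: Instead of the fixed 16x16 macroblocks of H.264, HEVC divides each picture into coding tree units of up to 64x64 luma samples. A quadtree splits each coding tree unit into coding units as small as 8x8, and transform blocks can be as small as 4x4, so the encoder can use large blocks in flat regions and small blocks around detail and motion.

3. **Prediction modes**: HEVC defines 35 intra prediction modes (planar, DC, and 33 angular directions), compared with at most 9 in H.264. For inter prediction it adds merge mode and advanced motion vector prediction, which reduce the cost of signalling motion.

4. **Entropy coding**: HEVC uses context-adaptive binary arithmetic coding (CABAC) as its only entropy coder. The context-adaptive variable-length coding (CAVLC) of H.264 was not carried over.

5. **In-loop filtering and residual coding**: HEVC applies a deblocking filter designed for parallel processing, followed by a sample adaptive offset (SAO) filter, to reduce blocking and ringing artefacts. The coding of residual transform coefficients was also redesigned.

6. **Colour and bit depth**: The first edition defines a Main 10 profile with 10-bit samples and can signal wide colour gamut primaries. Signalling for high dynamic range (HDR) video was extended in later editions.

7. **Multi-layer coding**: The first edition supports temporal sub-layers and reserves the nuh_layer_id field. Scalable multi-layer coding, which lets one bitstream serve several bit rates, was added in later editions.

8. **Network adaptation**: The NAL unit structure, parameter sets, slices, and random access point pictures (IDR, CRA, BLA) support splicing, stream switching, and sub-bitstream extraction for delivery over varying network conditions.

9. **3D video**: Multiview and 3D extensions were added in later editions. They are not part of the April 2013 text.

10. **Applications**: HEVC is used in 4K and 8K UHD television, online video streaming, mobile video playback, video conferencing, and surveillance cameras.

In summary, H.265/HEVC is a major milestone in video coding. Its combination of coding tools substantially improves compression efficiency, which helps meet the growing demand for high-resolution video while saving network bandwidth and storage.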

2 Rec. ITU-T H.265 (04/2013)

This is the first version of this Specification. Additional versions are anticipated.

0.5 Profiles, tiers and levels

This Recommendation | International Standard is designed to be generic in the sense that it serves a wide range of

applications, bit rates, resolutions, qualities, and services. Applications should cover, among other things, digital storage

media, television broadcasting and real-time communications. In the course of creating this Specification, various

requirements from typical applications have been considered, necessary algorithmic elements have been developed, and

these have been integrated into a single syntax. Hence, this Specification will facilitate video data interchange among

different applications.

Considering the practicality of implementing the full syntax of this Specification, however, a limited number of subsets

of the syntax are also stipulated by means of "profiles", "tiers", and "levels". These and other related terms are formally

defined in clause 3.

A "profile" is a subset of the entire bitstream syntax that is specified in this Recommendation | International Standard.

Within the bounds imposed by the syntax of a given profile it is still possible to require a very large variation in the

performance of encoders and decoders depending upon the values taken by syntax elements in the bitstream such as the

specified size of the decoded pictures. In many applications, it is currently neither practical nor economic to implement

a decoder capable of dealing with all hypothetical uses of the syntax within a particular profile.

In order to deal with this problem, "tiers" and "levels" are specified within each profile. A level of a tier is a specified

set of constraints imposed on values of the syntax elements in the bitstream. These constraints may be simple limits on

values. Alternatively they may take the form of constraints on arithmetic combinations of values (e.g., picture width

multiplied by picture height multiplied by number of pictures decoded per second). A level specified for a lower tier is

more constrained than a level specified for a higher tier.
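
The two kinds of constraint described above (a simple limit on a value, and a limit on an arithmetic combination of values) can be illustrated with a minimal Python sketch. The class `LevelLimits`, the function `fits_level`, and the numeric limits are hypothetical placeholders chosen for illustration; the normative limits are those given in Annex A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelLimits:
    """Hypothetical limits for one level of one tier (placeholder values)."""
    max_luma_picture_size: int   # limit on width * height, in luma samples
    max_luma_sample_rate: int    # limit on width * height * pictures per second


def fits_level(width: int, height: int, pictures_per_second: float,
               limits: LevelLimits) -> bool:
    """Return True if the stream stays within both limits."""
    picture_size = width * height                      # simple limit on a value
    sample_rate = picture_size * pictures_per_second   # arithmetic combination
    return (picture_size <= limits.max_luma_picture_size
            and sample_rate <= limits.max_luma_sample_rate)


# Placeholder numbers chosen only for illustration, not taken from Annex A.
example_limits = LevelLimits(max_luma_picture_size=2_500_000,
                             max_luma_sample_rate=150_000_000)

print(fits_level(1920, 1080, 60, example_limits))   # 1920x1080 at 60 pictures/s
print(fits_level(3840, 2160, 30, example_limits))   # 3840x2160 at 30 pictures/s
```

With these placeholder limits the first stream fits and the second does not, because its picture size alone exceeds the limit.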

Coded video content conforming to this Recommendation | International Standard uses a common syntax. In order to

achieve a subset of the complete syntax, flags, parameters, and other syntax elements are included in the bitstream that

signal the presence or absence of syntactic elements that occur later in the bitstream.

0.6 Overview of the design characteristics

The coded representation specified in the syntax is designed to enable a high compression capability for a desired image

or video quality. The algorithm is typically not lossless, as the exact source sample values are typically not preserved

through the encoding and decoding processes. A number of techniques may be used to achieve highly efficient

compression. Encoding algorithms (not specified in this Recommendation | International Standard) may select between

inter and intra coding for block-shaped regions of each picture. Inter coding uses motion vectors for block-based inter

prediction to exploit temporal statistical dependencies between different pictures. Intra coding uses various spatial

prediction modes to exploit spatial statistical dependencies in the source signal for a single picture. Motion vectors and

intra prediction modes may be specified for a variety of block sizes in the picture. The prediction residual may then be

further compressed using a transform to remove spatial correlation inside the transform block before it is quantized,

producing a possibly irreversible process that typically discards less important visual information while forming a close

approximation to the source samples. Finally, the motion vectors or intra prediction modes may also be further

compressed using a variety of prediction mechanisms, and, after prediction, are combined with the quantized transform

coefficient information and encoded using arithmetic coding.
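
A toy Python sketch of the predict, quantize, and reconstruct steps described above may help show why the process is typically not lossless. It is a deliberate simplification: the transform and the arithmetic coding stages are omitted, the quantizer is a plain uniform one, and the function names and sample values are assumptions made for this example rather than anything specified in the Recommendation.

```python
def encode_block(source, prediction, step):
    """Quantize the prediction residual; returns one integer level per sample."""
    return [round((s - p) / step) for s, p in zip(source, prediction)]


def decode_block(levels, prediction, step):
    """Rebuild samples from the prediction plus the dequantized residual."""
    return [p + level * step for p, level in zip(prediction, levels)]


source = [103, 104, 109, 117]       # samples of the block being coded
prediction = [98, 105, 110, 110]    # e.g. obtained by inter or intra prediction
step = 4                            # quantization step size

levels = encode_block(source, prediction, step)
reconstructed = decode_block(levels, prediction, step)

print(levels)          # small integers, cheaper to code than raw samples
print(reconstructed)   # close to, but not identical to, the source
```

The reconstructed samples differ from the source by at most half a quantization step, which is the information discarded by quantization.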

0.7 How to read this Specification

It is suggested that the reader starts with clause 1 (Scope) and moves on to clause 3 (Definitions). Clause 6 should be

read for the geometrical relationship of the source, input, and output of the decoder. Clause 7 (Syntax and semantics)

specifies the order to parse syntax elements from the bitstream. See clauses 7.1–7.3 for syntactical order and see

clause 7.4 for semantics; e.g., the scope, restrictions, and conditions that are imposed on the syntax elements. The actual

parsing for most syntax elements is specified in clause 9 (Parsing process). Clause 10 (Sub-bitstream extraction process)

specifies the sub-bitstream extraction process. Finally, clause 8 (Decoding process) specifies how the syntax elements

are mapped into decoded samples. Throughout reading this Specification, the reader should refer to clauses 2

(Normative references), 4 (Abbreviations), and 5 (Conventions) as needed. Annexes A through E also form an integral

part of this Recommendation | International Standard.

Annex A specifies profiles each being tailored to certain application domains, and defines the so-called tiers and levels

of the profiles. Annex B specifies syntax and semantics of a byte stream format for delivery of coded video as an

ordered stream of bytes. Annex C specifies the hypothetical reference decoder, bitstream conformance, decoder

conformance, and the use of the hypothetical reference decoder to check bitstream and decoder conformance. Annex D

specifies syntax and semantics for supplemental enhancement information message payloads. Annex E specifies syntax

and semantics of the video usability information parameters of the sequence parameter set.

Throughout this Specification, statements appearing with the preamble "NOTE –" are informative and are not an

integral part of this Recommendation | International Standard.

1 Scope

This Recommendation | International Standard specifies high efficiency video coding.

2 Normative references

2.1 General

The following Recommendations and International Standards contain provisions which, through reference in this text,

constitute provisions of this Recommendation | International Standard. At the time of publication, the editions indicated

were valid. All Recommendations and Standards are subject to revision, and parties to agreements based on this

Recommendation | International Standard are encouraged to investigate the possibility of applying the most recent

edition of the Recommendations and Standards listed below. Members of IEC and ISO maintain registers of currently

valid International Standards. The Telecommunication Standardization Bureau of the ITU maintains a list of currently

valid ITU-T Recommendations.

2.2 Identical Recommendations | International Standards

– None

2.3 Paired Recommendations | International Standards equivalent in technical content

– None

2.4 Additional references

– Recommendation ITU-T T.35 (in force), Procedure for the allocation of ITU-T defined codes for

non-standard facilities.

– ISO/IEC 11578: in force, Information technology — Open Systems Interconnection — Remote Procedure

Call (RPC).

– ISO 11664-1: in force, Colorimetry — Part 1: CIE standard colorimetric observers.

– ISO 12232: in force, Photography – Digital still cameras – Determination of exposure index, ISO speed

ratings, standard output sensitivity, and recommended exposure index.

– IETF RFC 1321 (in force), The MD5 Message-Digest Algorithm.

3 Definitions

For the purposes of this Recommendation | International Standard, the following definitions apply:

3.1 access unit: A set of NAL units that are associated with each other according to a specified classification rule,

are consecutive in decoding order, and contain exactly one coded picture.

NOTE – In addition to containing the VCL NAL units of the coded picture, an access unit may also contain

non-VCL NAL units. The decoding of an access unit always results in a decoded picture.

3.2 AC transform coefficient: Any transform coefficient for which the frequency index in at least one of the two

dimensions is non-zero.

3.3 associated non-VCL NAL unit: A non-VCL NAL unit (when present) for a VCL NAL unit where the VCL

NAL unit is the associated VCL NAL unit of the non-VCL NAL unit.

3.4 associated IRAP picture: The previous IRAP picture in decoding order (when present).

3.5 associated VCL NAL unit: The preceding VCL NAL unit in decoding order for a non-VCL NAL unit with

nal_unit_type equal to EOS_NUT, EOB_NUT, FD_NUT, or SUFFIX_SEI_NUT, or in the ranges of

RSV_NVCL45..RSV_NVCL47 or UNSPEC56..UNSPEC63; or otherwise the next VCL NAL unit in decoding

order.
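
The rule in 3.5 can be sketched in Python as follows. The numeric nal_unit_type values are assumed to be those of the NAL unit type table in clause 7 (VCL types 0 to 31, non-VCL types from 32 upward) and should be checked against that table; the function names are hypothetical.

```python
# nal_unit_type values assumed from the NAL unit type table in clause 7.
EOS_NUT, EOB_NUT, FD_NUT, SUFFIX_SEI_NUT = 36, 37, 38, 40
RSV_NVCL45, RSV_NVCL47 = 45, 47
UNSPEC56, UNSPEC63 = 56, 63
FIRST_NON_VCL_TYPE = 32  # types below this value are VCL NAL unit types


def associates_with_preceding(nal_unit_type: int) -> bool:
    """True if a non-VCL NAL unit of this type belongs to the preceding VCL NAL unit."""
    return (nal_unit_type in (EOS_NUT, EOB_NUT, FD_NUT, SUFFIX_SEI_NUT)
            or RSV_NVCL45 <= nal_unit_type <= RSV_NVCL47
            or UNSPEC56 <= nal_unit_type <= UNSPEC63)


def associated_vcl_index(types, i):
    """Index of the associated VCL NAL unit of the non-VCL NAL unit at index i.

    `types` lists nal_unit_type values in decoding order. Returns None when
    no VCL NAL unit exists in the required direction.
    """
    if types[i] < FIRST_NON_VCL_TYPE:
        raise ValueError("NAL unit at index i is a VCL NAL unit")
    if associates_with_preceding(types[i]):
        candidates = range(i - 1, -1, -1)      # search backwards
    else:
        candidates = range(i + 1, len(types))  # search forwards
    for j in candidates:
        if types[j] < FIRST_NON_VCL_TYPE:
            return j
    return None


# SPS, PPS, prefix SEI, one IDR slice segment, suffix SEI, end of sequence
types = [33, 34, 39, 19, 40, 36]
print([associated_vcl_index(types, i) for i in (0, 1, 2, 4, 5)])
```

In this example every non-VCL NAL unit maps to the single VCL NAL unit at index 3: the parameter sets and prefix SEI look forward, the suffix SEI and end of sequence NAL unit look back.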

3.6 bin: One bit of a bin string.

3.7 binarization: A set of bin strings for all possible values of a syntax element.

3.8 binarization process: A unique mapping process of all possible values of a syntax element onto a set of bin

strings.

3.9 bin string: An intermediate binary representation of values of syntax elements from the binarization of the

syntax element.

3.10 bi-predictive (B) slice: A slice that may be decoded using intra prediction or inter prediction using at most

two motion vectors and reference indices to predict the sample values of each block.

3.11 bitstream: A sequence of bits, in the form of a NAL unit stream or a byte stream, that forms the

representation of coded pictures and associated data forming one or more CVSs.

3.12 block: An MxN (M-column by N-row) array of samples, or an MxN array of transform coefficients.

3.13 broken link: A location in a bitstream at which it is indicated that some subsequent pictures in decoding

order may contain serious visual artefacts due to unspecified operations performed in the generation of the

bitstream.

3.14 broken link access (BLA) access unit: An access unit in which the coded picture is a BLA picture.

3.15 broken link access (BLA) picture: An IRAP picture for which each VCL NAL unit has nal_unit_type equal

to BLA_W_LP, BLA_W_RADL, or BLA_N_LP.

NOTE – A BLA picture contains only I slices, and may be the first picture in the bitstream in decoding order, or

may appear later in the bitstream. Each BLA picture begins a new CVS, and has the same effect on the decoding

process as an IDR picture. However, a BLA picture contains syntax elements that specify a non-empty RPS. When a

BLA picture for which each VCL NAL unit has nal_unit_type equal to BLA_W_LP, it may have associated RASL

pictures, which are not output by the decoder and may not be decodable, as they may contain references to pictures

that are not present in the bitstream. When a BLA picture for which each VCL NAL unit has nal_unit_type equal to

BLA_W_LP, it may also have associated RADL pictures, which are specified to be decoded. When a BLA picture

for which each VCL NAL unit has nal_unit_type equal to BLA_W_RADL, it does not have associated RASL

pictures but may have associated RADL pictures. When a BLA picture for which each VCL NAL unit has

nal_unit_type equal to BLA_N_LP, it does not have any associated leading pictures.

3.16 buffering period: The set of access units starting with an access unit that contains a buffering period SEI

message and containing all subsequent access units in decoding order up to but not including the next access

unit (when present) that contains a buffering period SEI message.

3.17 byte: A sequence of 8 bits, within which, when written or read as a sequence of bit values, the left-most and

right-most bits represent the most and least significant bits, respectively.

3.18 byte-aligned: A position in a bitstream is byte-aligned when the position is an integer multiple of 8 bits from

the position of the first bit in the bitstream, and a bit or byte or syntax element is said to be byte-aligned when

the position at which it appears in a bitstream is byte-aligned.
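
Definitions 3.17 and 3.18 translate directly into a small bit reader. The following Python sketch is illustrative only; the class name `BitReader` and its methods are assumptions made for this example and are not part of the Specification.

```python
class BitReader:
    """Reads bits most-significant-bit first, as in the definition of a byte."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits, counted from the first bit

    def byte_aligned(self) -> bool:
        """True when the position is an integer multiple of 8 bits."""
        return self.pos % 8 == 0

    def read_bit(self) -> int:
        byte = self.data[self.pos // 8]
        shift = 7 - (self.pos % 8)   # the left-most bit is the most significant
        self.pos += 1
        return (byte >> shift) & 1

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value


reader = BitReader(bytes([0b10110000, 0xFF]))
first = reader.read_bits(3)            # bits 1, 0, 1
aligned_after_3 = reader.byte_aligned()
rest = reader.read_bits(5)             # bits 1, 0, 0, 0, 0
aligned_after_8 = reader.byte_aligned()
print(first, aligned_after_3, rest, aligned_after_8)
```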

3.19 byte stream: An encapsulation of a NAL unit stream containing start code prefixes and NAL units as specified

in Annex B.
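
As a rough illustration of the byte stream encapsulation, the Python sketch below splits a byte stream into NAL units. It assumes the three-byte start code prefix 0x000001 of Annex B and relies on a NAL unit never ending in a zero byte, so that zero bytes before the next prefix can be dropped. The function name is hypothetical and the sketch does not validate the stream.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # assumed per Annex B


def split_byte_stream(stream: bytes) -> list[bytes]:
    """Return the NAL units found between start code prefixes."""
    nal_units = []
    start = stream.find(START_CODE_PREFIX)
    while start != -1:
        payload_start = start + len(START_CODE_PREFIX)
        next_start = stream.find(START_CODE_PREFIX, payload_start)
        end = len(stream) if next_start == -1 else next_start
        # Zero bytes before the next prefix are padding, not NAL unit data.
        nal_units.append(stream[payload_start:end].rstrip(b"\x00"))
        start = next_start
    return nal_units


stream = (b"\x00\x00\x00\x01" + b"\x40\x01\x0c"
          + b"\x00\x00\x00\x01" + b"\x42\x01\x01")
units = split_byte_stream(stream)
print([u.hex() for u in units])
```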

3.20 can: A term used to refer to behaviour that is allowed, but not necessarily required.

3.21 chroma: An adjective, represented by the symbols Cb and Cr, specifying that a sample array or single sample

is representing one of the two colour difference signals related to the primary colours.

NOTE – The term chroma is used rather than the term chrominance in order to avoid the implication of the use of

linear light transfer characteristics that is often associated with the term chrominance.

3.22 clean random access (CRA) access unit: An access unit in which the coded picture is a CRA picture.

3.23 clean random access (CRA) picture: An IRAP picture for which each VCL NAL unit has nal_unit_type

equal to CRA_NUT.

NOTE – A CRA picture contains only I slices, and may be the first picture in the bitstream in decoding order, or

may appear later in the bitstream. A CRA picture may have associated RADL or RASL pictures. When a CRA

picture has NoRaslOutputFlag equal to 1, the associated RASL pictures are not output by the decoder, because they

may not be decodable, as they may contain references to pictures that are not present in the bitstream.

3.24 coded picture: A coded representation of a picture containing all coding tree units of the picture.

3.25 coded picture buffer (CPB): A first-in first-out buffer containing decoding units in decoding order specified

in the hypothetical reference decoder in Annex C.

3.26 coded representation: A data element as represented in its coded form.

3.27 coded slice segment NAL unit: A NAL unit that has nal_unit_type in the range of TRAIL_N to RASL_R,

inclusive, or in the range of BLA_W_LP to RSV_IRAP_VCL23, inclusive, which indicates that the NAL unit

contains a coded slice segment.

3.28 coded video sequence (CVS): A sequence of access units that consists, in decoding order, of an IRAP access

unit with NoRaslOutputFlag equal to 1, followed by zero or more access units that are not IRAP access units

with NoRaslOutputFlag equal to 1, including all subsequent access units up to but not including any

subsequent access unit that is an IRAP access unit with NoRaslOutputFlag equal to 1.

NOTE – An IRAP access unit may be an IDR access unit, a BLA access unit, or a CRA access unit. The value of

NoRaslOutputFlag is equal to 1 for each IDR access unit, each BLA access unit, and each CRA access unit that is

the first access unit in the bitstream in decoding order, is the first access unit that follows an end of sequence NAL

unit in decoding order, or has HandleCraAsBlaFlag equal to 1.
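
The definition of a CVS amounts to cutting the sequence of access units at every IRAP access unit with NoRaslOutputFlag equal to 1. A minimal Python sketch, in which the `AccessUnit` class and its fields are hypothetical stand-ins for real decoder state:

```python
from dataclasses import dataclass


@dataclass
class AccessUnit:
    label: str
    is_irap: bool
    no_rasl_output_flag: bool = False  # only meaningful for IRAP access units


def split_into_cvss(access_units):
    """Group access units, given in decoding order, into coded video sequences."""
    cvss = []
    for au in access_units:
        if au.is_irap and au.no_rasl_output_flag:
            cvss.append([au])          # this access unit begins a new CVS
        elif cvss:
            cvss[-1].append(au)        # belongs to the CVS in progress
        else:
            raise ValueError("first access unit must be an IRAP access unit "
                             "with NoRaslOutputFlag equal to 1")
    return cvss


stream = [
    AccessUnit("IDR0", is_irap=True, no_rasl_output_flag=True),
    AccessUnit("P1", is_irap=False),
    AccessUnit("CRA2", is_irap=True, no_rasl_output_flag=False),
    AccessUnit("P3", is_irap=False),
    AccessUnit("IDR4", is_irap=True, no_rasl_output_flag=True),
    AccessUnit("P5", is_irap=False),
]
labels = [[au.label for au in cvs] for cvs in split_into_cvss(stream)]
print(labels)
```

Note that the CRA access unit in the middle of the stream does not start a new CVS here, because its NoRaslOutputFlag is 0.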

3.29 coding block: An NxN block of samples for some value of N such that the division of a coding tree block into

coding blocks is a partitioning.

3.30 coding tree block: An NxN block of samples for some value of N such that the division of a component into

coding tree blocks is a partitioning.

3.31 coding tree unit: A coding tree block of luma samples, two corresponding coding tree blocks of chroma

samples of a picture that has three sample arrays, or a coding tree block of samples of a monochrome picture

or a picture that is coded using three separate colour planes and syntax structures used to code the samples.

3.32 coding unit: A coding block of luma samples, two corresponding coding blocks of chroma samples of a

picture that has three sample arrays, or a coding block of samples of a monochrome picture or a picture that is

coded using three separate colour planes and syntax structures used to code the samples.
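
Definitions 3.29 to 3.32 require only that the division into coding blocks be a partitioning. In HEVC that division is made by recursively splitting a coding tree block into four quadrants, which the following Python sketch imitates. The split decision is an encoder choice, so `should_split` is a hypothetical callback, and the block sizes are example values.

```python
def partition(x, y, size, min_size, should_split):
    """Recursively split a square block into quadrants.

    Returns a list of (x, y, size) coding blocks that exactly tile the
    original block, i.e. a partitioning.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += partition(x + dx, y + dy, half, min_size, should_split)
        return blocks
    return [(x, y, size)]


# Hypothetical rule: keep refining the top-left corner down to 16x16.
def refine_top_left(x, y, size):
    return x == 0 and y == 0 and size > 16


blocks = partition(0, 0, 64, 8, refine_top_left)
print(blocks)
print(sum(size * size for _, _, size in blocks) == 64 * 64)  # areas add up
```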

3.33 component: An array or single sample from one of the three arrays (luma and two chroma) that compose a

picture in 4:2:0, 4:2:2, or 4:4:4 colour format or the array or a single sample of the array that compose a

picture in monochrome format.

3.34 context variable: A variable specified for the adaptive binary arithmetic decoding process of a bin by an

equation containing recently decoded bins.

3.35 cropped decoded picture: The result of cropping a decoded picture based on the conformance cropping

window specified in the SPS that is referred to by the corresponding coded picture.

3.36 decoded picture: A decoded picture is derived by decoding a coded picture.

3.37 decoded picture buffer (DPB): A buffer holding decoded pictures for reference, output reordering, or output

delay specified for the hypothetical reference decoder in Annex C.

3.38 decoder: An embodiment of a decoding process.

3.39 decoder under test (DUT): A decoder that is tested for conformance to this Specification by operating the

hypothetical stream scheduler to deliver a conforming bitstream to the decoder and to the hypothetical

reference decoder and comparing the values and timing or order of the output of the two decoders.

3.40 decoding order: The order in which syntax elements are processed by the decoding process.

3.41 decoding process: The process specified in this Specification that reads a bitstream and derives decoded

pictures from it.

3.42 decoding unit: An access unit if SubPicHrdFlag is equal to 0 or a subset of an access unit otherwise,

consisting of one or more VCL NAL units in an access unit and the associated non-VCL NAL units.

3.43 dependent slice segment: A slice segment for which the values of some syntax elements of the slice segment

header are inferred from the values for the preceding independent slice segment in decoding order.

3.44 display process: A process not specified in this Specification having, as its input, the cropped decoded

pictures that are the output of the decoding process.

3.45 elementary stream: A sequence of one or more bitstreams.

NOTE – An elementary stream that consists of two or more bitstreams would typically have been formed by splicing

together two or more bitstreams (or parts thereof).

3.46 emulation prevention byte: A byte equal to 0x03 that is present within a NAL unit when the syntax elements

of the bitstream form certain patterns of byte values in a manner that ensures that no sequence of consecutive

byte-aligned bytes in the NAL unit can contain a start code prefix.
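
The mechanism behind the emulation prevention byte can be sketched in Python. The rule used here, inserting 0x03 after two consecutive zero bytes whenever the next byte is 0x03 or less, is taken from the NAL unit semantics in clause 7 rather than from this definition. The function names are hypothetical, and the special case of a payload that ends in a zero byte is omitted.

```python
def add_emulation_prevention(payload: bytes) -> bytes:
    """Insert 0x03 so that no start code prefix can appear in the output."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just written
    for b in payload:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)   # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def remove_emulation_prevention(nal_bytes: bytes) -> bytes:
    """Drop each 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in nal_bytes:
        if zeros >= 2 and b == 0x03:
            zeros = 0          # skip the emulation prevention byte
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


raw = bytes([0x25, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x03, 0x80])
protected = add_emulation_prevention(raw)
print(protected.hex())
print(remove_emulation_prevention(protected) == raw)
```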

3.47 encoder: An embodiment of an encoding process.

3.48 encoding process: A process not specified in this Specification that produces a bitstream conforming to this

Specification.

3.49 field: An assembly of alternative rows of samples of a frame.

3.50 filler data NAL units: NAL units with nal_unit_type equal to FD_NUT.

3.51 flag: A variable that can take one of the two possible values 0 and 1.

3.52 frame: The composition of a top field and a bottom field, where sample rows 0, 2, 4, ... originate from the top

field and sample rows 1, 3, 5, ... originate from the bottom field.
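
The relationship between a frame and its two fields is a simple row interleave. A minimal Python sketch, with rows represented as lists and function names chosen for this example:

```python
def weave_fields(top_field, bottom_field):
    """Build a frame: rows 0, 2, 4, ... from the top field, rows 1, 3, 5, ... from the bottom field."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


def split_frame(frame):
    """Recover the (top, bottom) fields as the alternate rows of a frame."""
    return frame[0::2], frame[1::2]


top = [[10, 11], [30, 31]]
bottom = [[20, 21], [40, 41]]
frame = weave_fields(top, bottom)
print(frame)
print(split_frame(frame) == (top, bottom))
```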

3.53 frequency index: A one-dimensional or two-dimensional index associated with a transform coefficient prior

to an inverse transform part of the decoding process.

3.54 hypothetical reference decoder (HRD): A hypothetical decoder model that specifies constraints on the

variability of conforming NAL unit streams or conforming byte streams that an encoding process may

produce.

3.55 hypothetical stream scheduler (HSS): A hypothetical delivery mechanism used for checking the

conformance of a bitstream or a decoder with regards to the timing and data flow of the input of a bitstream

into the hypothetical reference decoder.

3.56 independent slice segment: A slice segment for which the values of the syntax elements of the slice segment

header are not inferred from the values for a preceding slice segment.

3.57 informative: A term used to refer to content provided in this Specification that does not establish any

mandatory requirements for conformance to this Specification and thus is not considered an integral part of

this Specification.

3.58 instantaneous decoding refresh (IDR) access unit: An access unit in which the coded picture is an IDR

picture.

3.59 instantaneous decoding refresh (IDR) picture: An IRAP picture for which each VCL NAL unit has

nal_unit_type equal to IDR_W_RADL or IDR_N_LP.

NOTE – An IDR picture contains only I slices, and may be the first picture in the bitstream in decoding order, or

may appear later in the bitstream. Each IDR picture is the first picture of a CVS in decoding order. When an IDR

picture for which each VCL NAL unit has nal_unit_type equal to IDR_W_RADL, it may have associated RADL

pictures. When an IDR picture for which each VCL NAL unit has nal_unit_type equal to IDR_N_LP, it does not

have any associated leading pictures. An IDR picture does not have associated RASL pictures.

3.60 inter coding: Coding of a coding block, slice, or picture that uses inter prediction.

3.61 inter prediction: A prediction derived in a manner that is dependent on data elements (e.g., sample values or

motion vectors) of pictures other than the current picture.

3.62 intra coding: Coding of a coding block, slice, or picture that uses intra prediction.

3.63 intra prediction: A prediction derived from only data elements (e.g., sample values) of the same decoded

slice.

3.64 intra random access point (IRAP) access unit: An access unit in which the coded picture is an IRAP

picture.

3.65 intra random access point (IRAP) picture: A coded picture for which each VCL NAL unit has nal_unit_type

in the range of BLA_W_LP to RSV_IRAP_VCL23, inclusive.

NOTE – An IRAP picture contains only I slices, and may be a BLA picture, a CRA picture or an IDR picture. The

first picture in the bitstream in decoding order must be an IRAP picture. Provided the necessary parameter sets are

available when they need to be activated, the IRAP picture and all subsequent non-RASL pictures in decoding order

can be correctly decoded without performing the decoding process of any pictures that precede the IRAP picture in

decoding order. There may be pictures in a bitstream that contain only I slices that are not IRAP pictures.
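
The IRAP picture types and the leading pictures each may have, as stated in the NOTEs to 3.15, 3.23, and 3.59, can be summarised in a short Python sketch. The numeric nal_unit_type values are assumed from the NAL unit type table in clause 7 and should be checked against it; the function names and the string labels are hypothetical.

```python
# nal_unit_type values assumed from the NAL unit type table in clause 7.
BLA_W_LP, BLA_W_RADL, BLA_N_LP = 16, 17, 18
IDR_W_RADL, IDR_N_LP = 19, 20
CRA_NUT = 21
RSV_IRAP_VCL23 = 23


def irap_kind(nal_unit_type):
    """Return 'BLA', 'IDR', 'CRA', 'reserved IRAP', or None for non-IRAP types."""
    if nal_unit_type in (BLA_W_LP, BLA_W_RADL, BLA_N_LP):
        return "BLA"
    if nal_unit_type in (IDR_W_RADL, IDR_N_LP):
        return "IDR"
    if nal_unit_type == CRA_NUT:
        return "CRA"
    if BLA_W_LP <= nal_unit_type <= RSV_IRAP_VCL23:
        return "reserved IRAP"
    return None


def allowed_leading_pictures(nal_unit_type):
    """Leading picture types that may be associated, or None if not covered."""
    table = {
        BLA_W_LP: {"RASL", "RADL"},
        BLA_W_RADL: {"RADL"},
        BLA_N_LP: set(),
        IDR_W_RADL: {"RADL"},
        IDR_N_LP: set(),
        CRA_NUT: {"RASL", "RADL"},
    }
    return table.get(nal_unit_type)


for t in (BLA_W_RADL, IDR_N_LP, CRA_NUT, 1):
    print(t, irap_kind(t), allowed_leading_pictures(t))
```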

3.66 intra (I) slice: A slice that is decoded using intra prediction only.

3.67 inverse transform: A part of the decoding process by which a set of transform coefficients are converted into

spatial-domain values.

3.68 layer: A set of VCL NAL units that all have a particular value of nuh_layer_id and the associated non-VCL

NAL units, or one of a set of syntactical structures having a hierarchical relationship.

NOTE – Depending on the context, either the first layer concept or the second layer concept applies. The first layer

concept is also referred to as a scalable layer, wherein a layer may be a spatial scalable layer, a quality scalable

layer, a view, etc. A temporal true subset of a scalable layer is not referred to as a layer but referred to as a sub-layer
