HEVC Encoding Optimization: A Low-Complexity Strategy Based on Binary and Multi-Class Learning

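"This research paper explores a low-complexity optimization technique for HEVC encoding based on binary and multi-class learning, aiming to reduce computational complexity while preserving the compression efficiency of High Efficiency Video Coding (HEVC)."

High Efficiency Video Coding (HEVC) is a widely deployed video coding standard. It significantly improves compression efficiency by adopting a quad-tree coding unit (CU) structure and variable prediction unit (PU) modes, but this gain comes at the cost of higher computational complexity. To address this problem, the paper proposes a fast HEVC encoding algorithm based on binary and multi-class support vector machines (SVMs).

First, the paper models the recursive CU decision and the PU selection in HEVC as hierarchical binary and multi-class classification structures. Binary classification corresponds to the CU split decision, and multi-class classification is used for PU mode selection. These two structures capture the key decision steps in the encoding process.

Next, using these two structures, the paper optimizes CU decision and PU selection with binary and multi-class SVMs. Specifically, trained classifiers predict whether a CU should be split and which PU mode should be chosen, which avoids the conventional, computationally intensive rate-distortion (RD) cost calculation. This reduces the computation and the encoding time while preserving coding quality as far as possible.

An SVM is a supervised learning model that is particularly well suited to small-sample and nonlinear classification problems. In this work, SVMs are trained to identify which CU and PU modes are most likely in a given context, so that decisions can be made early and unnecessary computation is skipped. A binary SVM handles two-class problems, while a multi-class SVM handles problems with several possible classes, which fits the CU and PU decisions in HEVC well.

To validate the proposed algorithm, the paper likely includes an experimental section that compares it with a conventional HEVC encoder in terms of encoding speed-up and loss of coding efficiency. Such experiments typically use standard test sequences and evaluate the results with objective and subjective quality measures.

In summary, the paper offers a new optimization strategy for HEVC encoding: binary and multi-class learning reduce encoding complexity while largely maintaining coding efficiency. The approach is especially valuable for real-time video encoding and for resource-constrained devices, because it balances coding quality against computational cost.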

ZHU et al.: BINARY AND MULTI-CLASS LEARNING BASED LOW COMPLEXITY OPTIMIZATION FOR HEVC ENCODING 549

Fig. 2. Example of the CUs and PUs with the minimum RD cost. (a) CUs and PUs with the minimum RD cost in “BasketballPass” sequence. (b) PUs with the minimum RD cost. (c) CUs with the minimum RD cost. (d) Pruned quad-tree structure of (c).

TABLE I

UPPER BOUND OF POTENTIAL TIME SAVING (TS) RATIO UNDER DIFFERENT QUANTIZATION PARAMETERS (QPs) [UNIT: %] (TRUE CUs/PUs CHECKING VS. ALL CANDIDATE CUs/PUs CHECKING)

complexity. The proposed learning based fast HEVC encoding algorithm is presented in Section III. Section IV describes how the optimal parameters are determined for flexible complexity allocation. Experimental results and analysis are discussed in Section V. Finally, conclusions are drawn in Section VI.

II. MOTIVATION AND STATISTICAL ANALYSIS

HEVC adopts a recursive CU decision and a selection among multiple PUs, as shown in Fig. 1. CU sizes can be selected recursively from 64×64 down to 8×8, denoted Depth 0 to Depth 3. Fig. 1(a) shows the recursive splitting process: the current CU can be split into four sub-CUs, and each sub-CU can be split further until it reaches the smallest size of 8×8. This recursive CU decision can be represented as a quad-tree, as shown in Fig. 1(b). After all CU candidates (i.e., every node of the quad-tree) have been checked, the CU or combination of CUs with the minimum RD cost is selected. In addition, PU selection is performed for each CU: the PU with the minimum RD cost is selected from 11 mode candidates, i.e., SKIP/Merge, Inter_2N×2N, Inter_2N×N, Inter_N×2N, Inter_N×N, Asymmetric Motion Partitions (AMP, comprising Inter_2N×nU, Inter_2N×nD, Inter_nL×2N, and Inter_nR×2N), Intra_2N×2N, and Intra_N×N, as shown in Fig. 1(c). During the recursive CU decision and PU selection, HEVC checks dozens of mode candidates one by one and then selects the one with the minimum RD cost, which is extremely time-consuming [2].
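The recursive decision described above can be illustrated with a minimal sketch. It assumes a made-up cost function (`toy_rd_cost`) in place of a real RD cost, ignores PU selection, and uses hypothetical names throughout; it is not the HEVC test model's implementation. It only shows that an exhaustive search visits every quad-tree node (1 + 4 + 16 + 64 = 85 CU candidates per 64×64 block) before it can prune.

```python
MAX_DEPTH = 3  # Depth 0 is a 64x64 CU, Depth 3 is an 8x8 CU


def exhaustive_cu_search(x, y, size, depth, rd_cost, stats):
    """Return (best_cost, tree) for the CU whose top-left corner is (x, y).

    tree is None when the CU is kept whole, or a list of four sub-trees
    (raster order) when splitting gives the lower total cost.
    """
    stats["evaluated"] += 1          # every quad-tree node is visited
    cost_unsplit = rd_cost(x, y, size)
    if depth == MAX_DEPTH:           # 8x8 CUs cannot be split further
        return cost_unsplit, None
    half = size // 2
    children = [
        exhaustive_cu_search(x + dx, y + dy, half, depth + 1, rd_cost, stats)
        for dy in (0, half)
        for dx in (0, half)
    ]
    cost_split = sum(cost for cost, _ in children)
    if cost_split < cost_unsplit:
        return cost_split, [subtree for _, subtree in children]
    return cost_unsplit, None


def toy_rd_cost(x, y, size, detail=(40, 40)):
    """Hypothetical stand-in for an RD cost, NOT the HEVC test model's.

    Each CU pays a fixed overhead of 10; a CU covering the 'detail' sample
    pays an extra size*size/16, so large CUs are penalised only there.
    """
    px, py = detail
    covers = x <= px < x + size and y <= py < y + size
    return 10.0 + (size * size / 16 if covers else 0.0)


stats = {"evaluated": 0}
best_cost, tree = exhaustive_cu_search(0, 0, 64, 0, toy_rd_cost, stats)
print(stats["evaluated"], best_cost, tree)
```

With this toy cost, the search keeps three 32×32 CUs whole and splits the fourth into four 16×16 CUs, yet it still evaluates all 85 nodes to reach that result.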

Fig. 2(a) shows an example of the CUs and PUs with the minimum RD cost in the BasketballPass (416×240) sequence. One 64×64 CU is marked with a red boundary, and its PU and CU partitions with the minimum RD cost are shown in Fig. 2(b) and Fig. 2(c), respectively. The pruned quad-tree structure of Fig. 2(c) is shown in Fig. 2(d). Compared with the full set of candidates in Fig. 1, only a few CUs and PUs are ultimately selected. Therefore, if the CUs and PUs with the minimum RD cost could be predicted precisely without full RD cost calculation and comparison, the computational complexity would be reduced significantly. We then conduct statistical experiments to explore the complexity redundancy in HEVC. The procedure is as follows. First, the ground truths (the CUs and PUs with the minimum RD cost) are recorded while encoding the video with the HEVC test model. Second, the sequences are encoded again using the recorded ground truths, so that only the true CUs/PUs are checked. Third, the resulting encoding complexity is compared with that of the HEVC test model. Table I shows the upper bound of the potential Time Saving (TS). "CU level" indicates that CUs are predicted directly instead of through RD cost calculation and comparison. "PU level" indicates that PUs are predicted directly without checking all the candidates. "CU + PU" means that both the CU and PU levels are activated. Table I shows that if these CUs and PUs are predicted precisely without RD cost calculation, the average upper bound of TS reaches 66.4%, 44.8%, and 82.3% at the CU, PU, and CU + PU levels, respectively. Predicting the CUs/PUs directly therefore offers great potential for complexity reduction.
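This excerpt does not state how the TS ratio is computed. A common definition, assumed here, is the relative reduction in encoding time with respect to the HEVC test model. The symbols $T_{\mathrm{HM}}$ (encoding time of the HEVC test model with all candidates checked) and $T_{\mathrm{GT}}$ (encoding time when only the recorded ground-truth CUs/PUs are checked) are notation introduced for this sketch, not taken from the paper:

```latex
\mathrm{TS} = \frac{T_{\mathrm{HM}} - T_{\mathrm{GT}}}{T_{\mathrm{HM}}} \times 100\%
```

Under this reading, a TS of 66.4% would mean that the second encoding pass takes about one third of the time of the first.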

Thus, CU decision and PU selection in video coding can be regarded as classification problems: (1) At the CU level, the recursive CU decision is modeled as a three-level hierarchical binary classification problem, i.e., whether the current CU is split further is determined directly by a binary classifier. (2) At the PU level, selecting the best PU from multiple candidates is modeled as a multi-class classification problem, i.e., the best PU can be predicted directly by a multi-class classifier without checking all candidates.
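The two classification structures can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the feature extractor (`toy_features`), the three per-depth split classifiers (`toy_split_clfs`), and the PU classifier (`toy_pu_clf`) are hypothetical threshold rules standing in for trained SVMs, and all function names are invented for this example.

```python
MAX_DEPTH = 3  # Depth 0 is a 64x64 CU, Depth 3 is an 8x8 CU


def predict_partition(x, y, size, depth, features_of, split_clfs, pu_clf):
    """Build a pruned quad-tree from classifier outputs only (no RD search)."""
    feats = features_of(x, y, size)
    # One binary classifier per depth 0..2; 8x8 CUs at Depth 3 are never split.
    if depth < MAX_DEPTH and split_clfs[depth](feats):
        half = size // 2
        return {
            "size": size,
            "children": [
                predict_partition(x + dx, y + dy, half, depth + 1,
                                  features_of, split_clfs, pu_clf)
                for dy in (0, half)
                for dx in (0, half)
            ],
        }
    # Leaf CU: a multi-class classifier picks the PU mode directly.
    return {"size": size, "pu": pu_clf(feats)}


def leaves(tree):
    """Flatten the tree into a list of (size, pu_mode) for the chosen CUs."""
    if "children" in tree:
        return [leaf for child in tree["children"] for leaf in leaves(child)]
    return [(tree["size"], tree["pu"])]


# Hypothetical stand-ins for feature extraction and for trained SVMs.
def toy_features(x, y, size, detail=(40, 40)):
    px, py = detail
    covers = x <= px < x + size and y <= py < y + size
    return [1.0 if covers else 0.0]


toy_split_clfs = [
    lambda f: f[0] > 0.5,   # Depth 0: split the 64x64 CU?
    lambda f: f[0] > 0.5,   # Depth 1: split the 32x32 CU?
    lambda f: False,        # Depth 2: never split 16x16 in this toy setup
]


def toy_pu_clf(f):
    return "Inter_2Nx2N" if f[0] > 0.5 else "SKIP/Merge"


predicted = predict_partition(0, 0, 64, 0,
                              toy_features, toy_split_clfs, toy_pu_clf)
chosen = leaves(predicted)
print(chosen)
```

Each CU is visited at most once and no cost is computed, which is where the complexity reduction would come from. The price is that a wrong prediction cannot be corrected by a later RD comparison in this simplified form.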

III. PROPOSED LEARNING BASED FAST HEVC ENCODING ALGORITHM

In this section, we propose a fast CU decision and PU selection algorithm. Firstly, the decision structure for CU and PU
