CU Split Prediction for HEVC Transcoding: An LSTM-Based Method

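Updated 2024-08-26, PDF, 689 KB

"This paper proposes a hierarchical Long Short-Term Memory (LSTM) method for predicting coding unit (CU) splitting in H.264 to HEVC transcoding. By analyzing the correlation between H.264 features and CU splitting patterns, the authors design a hierarchical LSTM architecture that takes H.264 features (residual, macroblock partition and bit allocation) as input to predict the CU splitting of HEVC. Experimental results show that the method outperforms existing H.264 to HEVC transcoding techniques in both complexity reduction and PSNR performance."

The paper addresses an optimization problem in video coding: making the conversion from the H.264 video coding standard to High Efficiency Video Coding (HEVC) more efficient. H.264 is a widely used video coding standard, and HEVC is its successor, designed to deliver higher compression efficiency, but transcoding into it can add considerable computational complexity. The core contribution is an LSTM-based algorithm that predicts the CU (coding unit) splitting decisions made during transcoding. The CU is the basic unit of HEVC coding, and its splitting strategy directly affects coding efficiency and video quality. LSTM is a special kind of recurrent neural network that handles long-term dependencies in sequential data well, which makes it suitable for analyzing consecutive video frames.

The authors first analyze H.264 coding features and CU splitting patterns and identify the correlation between them. They then build a hierarchical LSTM model that takes context at different levels into account in order to predict the CU splitting of HEVC more accurately. The inputs to the model are the H.264 residual (the difference between the source block and its prediction), the macroblock (MB) partition (the macroblock is the basic coding unit of H.264) and the bit allocation (a key factor in coding efficiency and image quality).

Experimental results show that the proposed LSTM prediction method outperforms current mainstream techniques both in reducing transcoding complexity and in video quality, measured by peak signal-to-noise ratio (PSNR). This suggests that learning-based optimization of the transcoding strategy can balance coding efficiency against transcoding cost, and the work is a useful reference for research on video coding and transcoding.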

An LSTM Method for Predicting CU Splitting in H.264 to HEVC Transcoding

Yanan Wei, Zulin Wang #∗, Mai Xu, Shuhao Qiao

School of Electronic and Information Engineering, Beihang University, Beijing, China

∗ Collaborative Innovation Center of Geospatial Technology, Wuhan, China

Corresponding Author: Mai Xu (maixu@buaa.edu.cn)

Abstract—For H.264 to high efficiency video coding (HEVC) transcoding, this paper proposes a hierarchical Long Short-Term Memory (LSTM) method to predict coding unit (CU) splitting. Specifically, we first analyze the correlation between CU splitting patterns and H.264 features. Upon our analysis, we further propose a hierarchical LSTM architecture for predicting CU splitting of HEVC, with regard to the explored H.264 features. The features of H.264, including residual, macroblock (MB) partition and bit allocation, are employed as the input to our LSTM method. Experimental results demonstrate that the proposed method outperforms the state-of-the-art H.264 to HEVC transcoding methods, in terms of both complexity reduction and PSNR performance.

Index Terms—H.264, HEVC, Transcoding, LSTM, CU splitting
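As a rough illustration of what a hierarchical CU splitting prediction has to produce, the sketch below decodes a quadtree partition of one CTU from per-CU split probabilities. It is not the LSTM proposed in the paper: `split_prob` is a hypothetical stand-in for any predictor, and the 64 × 64 CTU size, the 8 × 8 minimum CU size, the 0.5 threshold and all function names are assumptions made for this example.

```python
from typing import Callable, List, Tuple

# A leaf CU inside the CTU, given as (x, y, size) in pixels.
CU = Tuple[int, int, int]


def decode_cu_partition(
    split_prob: Callable[[int, int, int], float],
    ctu_size: int = 64,      # assumed CTU size
    min_cu_size: int = 8,    # assumed minimum CU size
    threshold: float = 0.5,  # assumed decision threshold
) -> List[CU]:
    """Top-down quadtree decode: split a CU only when the predictor says so."""
    leaves: List[CU] = []

    def visit(x: int, y: int, size: int) -> None:
        # A CU at the minimum size can never be split further.
        if size > min_cu_size and split_prob(x, y, size) >= threshold:
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    visit(x + dx, y + dy, half)
        else:
            leaves.append((x, y, size))

    visit(0, 0, ctu_size)
    return leaves


def toy_predictor(x: int, y: int, size: int) -> float:
    """Hypothetical predictor: split the CTU, then only its top-left 32x32 CU."""
    if size == 64:
        return 0.9
    if size == 32 and (x, y) == (0, 0):
        return 0.8
    return 0.1


partition = decode_cu_partition(toy_predictor)
for cu in partition:
    print(cu)
```

Under these assumptions, a transcoder would evaluate only the CUs returned by the decode step instead of searching every depth, which is where a complexity reduction could come from.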

I. INTRODUCTION

Transcoding is a technique that converts a video stream from one encoding into another. Alongside the evolution of video coding standards, compression efficiency has gradually improved. As a result, several video coding standards (e.g., MPEG-1, MPEG-2, MPEG-4, H.263, H.264 and high efficiency video coding (HEVC)) co-exist in a certain range of applications, which makes transcoding desirable. Video transcoding is a proper solution that bridges the gap in sharing multimedia content across various types of multimedia devices (e.g., television, computer, laptop, tablet and smart phone). Therefore, transcoding has attracted increasing attention [1].

In the past two decades, many transcoding algorithms have been proposed with promising performance. However, the latest video coding standard HEVC, which achieves outstanding coding efficiency at the cost of large computational complexity, still challenges the existing transcoding algorithms. As the state-of-the-art video coding standard, HEVC offers excellent rate-distortion performance and supports higher resolution video coding. As a result, a large number of videos have been encoded by HEVC over the past few years. Meanwhile, more and more terminals tend to adopt this new standard. On the other hand, extensive video streams encoded by the previous H.264 standard need to be transcoded into the HEVC domain. To this end, efficient transcoding from H.264 to HEVC has received a great deal of research effort.

[Figure: two frame sequences (frames 109, 110, 113, 117, 121 and frames 169, 170, 173, 177, 181); legend: "Same CU splitting".]

Fig. 1. Two examples of the temporal similarity of CU partition.

In fact, H.264 to HEVC transcoding can be accomplished by a full H.264 decoding process followed by a full HEVC encoding process. However, such a procedure is inefficient, as HEVC encoding is rather time-consuming. In particular, the coding tree unit (CTU) partition of HEVC consumes much of the computation time [2], as all possible splitting patterns of the coding unit (CU) need to be traversed for rate-distortion optimization. Thus, it is important to predict the CU partition of HEVC according to H.264 bitstreams when designing an efficient transcoding method. The methods for H.264 to HEVC transcoding can be divided into two categories: heuristic and data-driven. Heuristic methods normally leverage or extract some specific knowledge in the compressed bitstream, combined with human knowledge, to accomplish the transcoding from H.264 to HEVC. For example, in [3], the variance of motion vectors (MVs) of four H.264 macroblocks (MBs) is used to explore the possibility of merging them to form a larger CU in HEVC. Mora et al. [4] applied motion similarity of H.264 MBs to build a fusion map, which is used to limit the depth of CUs in HEVC coded frames. Compared with heuristic methods, data-driven methods make full use of training data to accomplish CU splitting in H.264 to HEVC transcoding, which achieves better performance. In [2], [5], [6], [7], a linear discriminant is applied to map the MBs in H.264 to 64 × 64 or 32 × 32 CUs in HEVC. A decision tree is utilized in [8] for fast CU splitting decisions during H.264 to HEVC transcoding, in light of a mining

VCIP 2017, Dec. 10 – 13, 2017, St Petersburg, U.S.A.
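To give a sense of the search space behind the CU partition cost discussed in the introduction, the following sketch counts the quadtree partitions of a single CTU. It assumes a 64 × 64 CTU and an 8 × 8 minimum CU (a common HEVC configuration that this excerpt does not state) and ignores prediction unit and transform unit choices; the function names are illustrative only.

```python
def count_partitions(cu_size: int, min_cu_size: int = 8) -> int:
    """Number of distinct quadtree partitions of a square CU."""
    if cu_size <= min_cu_size:
        return 1  # smallest CU: the only option is "no split"
    # Either keep the CU whole (1 way), or split it into four sub-CUs,
    # each of which is then partitioned independently.
    return 1 + count_partitions(cu_size // 2, min_cu_size) ** 4


def count_cu_nodes(ctu_size: int = 64, min_cu_size: int = 8) -> int:
    """Number of candidate CUs in the full quadtree (one per tree node)."""
    nodes, per_level, size = 0, 1, ctu_size
    while size >= min_cu_size:
        nodes += per_level
        per_level *= 4
        size //= 2
    return nodes


for size in (8, 16, 32, 64):
    print(size, count_partitions(size))
print(count_partitions(64))  # 83522
print(count_cu_nodes())      # 85
```

A recursive rate-distortion search does not enumerate the 83,522 patterns one by one; it evaluates each of the 85 candidate CUs and combines the costs bottom-up. Predicting the split decisions from the H.264 bitstream would let the encoder skip most of those evaluations.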
