Motion and Disparity Estimation Optimization for Multiview Video Coding with Reduced Computational Complexity


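This paper addresses efficient motion and disparity estimation optimization for low-complexity multiview video coding (MVC). It targets the key coding tools in MVC, namely variable block-size motion estimation (MBME), disparity estimation (DE), and multiple reference frame selection, and proposes an improved method that reduces their computational complexity. As demand for MVC grows in applications such as real-time video broadcasting, these advanced coding tools become impractical because of their high computational load.

First, the authors propose an early DIRECT mode decision algorithm based on the characteristics of the coded block pattern and the rate-distortion (RD) cost. The algorithm uses the statistical properties of the coded data to identify the likely optimal decision quickly, which removes unnecessary computation steps.

Second, the paper proposes an early termination strategy based on the characteristics of the initial search point in motion estimation and disparity estimation. Observations of the search process show that the best point tends to lie near the center of the search region, so the search ends early when certain conditions are met. If the early termination strategy does not meet the performance requirement, the paper also examines how to shrink the search window to reduce the computational complexity further. This lets the encoder keep sufficient coding accuracy while lowering its computational demand, which better fits the timing constraints of real-time video coding.

The main contribution is an MBME and DE algorithm that balances coding efficiency against computational cost, which is essential for efficient real-time transmission in multiview video coding. The algorithm has practical value for reducing the encoding burden in applications such as real-time video broadcasting and offers a practical optimization for future multiview video coding techniques.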

168 IEEE TRANSACTIONS ON BROADCASTING, VOL. 61, NO. 2, JUNE 2015

TABLE I
TEST CONDITIONS

TABLE III
CPU TIME DISTRIBUTION IN ENCODING PROCESS (UNIT: %)

T = {DIRECT, SubDIRECT, I16MB, I8MB, I4MB, PCM}, (2)

where the DIRECT mode in T represents either the DIRECT mode in a B frame or the SKIP mode in a P frame.

Let event A denote that a candidate mode in S is selected as the optimal mode, and let event B denote that a candidate mode in T is selected as the optimal mode. P(A) and P(B) are the probabilities of events A and B, respectively.

We test four multiview video sequences with different levels of motion activity to analyze the probabilities P(A) and P(B): “Ballroom” has fast motion, “Ballet” has medium motion, and “Exit” and “Doorflowers” have slow motion. The test conditions are listed in Table I, and the resulting probabilities P(A) and P(B) are tabulated in Table II.

Table II shows that a large number of MBs select a mode from T as their optimal mode. In the even views, P(B) ranges from 67.26% to 96.23%, with an average of 84.17%. In the odd views, P(B) ranges from 65.86% to 95.01%, with an average of 82.00%. Correspondingly, only 15.83% and 18.00% of MBs are encoded with a mode in set S in the even and odd views, respectively. In the even views, for example, most MBs (84.17%) are encoded with a mode in T, and only a small fraction (15.83%) require ME/DE.

Although the candidate modes in set S have a small probability of being selected as the optimal mode, they consume the major share of the total encoding time. According to the experimental results in Table III, checking the candidate modes in S takes 97.86% and 99.00% of the total encoding time in the even and odd views, respectively. This time is spent mainly on ME and DE, which remove the temporal and inter-view redundancies, respectively. Hence, if the ME and DE processes can be simplified, significant encoding time can be saved.

To further analyze the candidate modes in T, let event C denote that DIRECT is selected as the optimal mode, and let event D denote that an INTRA mode is selected as the optimal mode. Table II gives the conditional probabilities P(C|B) and P(D|B), that is, the probabilities that events C and D occur given that B has occurred. In the even views, P(C|B) ranges from 86.82% to 98.52%, with an average of 93.62%. In the odd views, P(C|B) ranges from 95.64% to 99.97%, with an average of 98.75%. In contrast, P(D|B) is relatively small: 6.38% and 1.25% on average in the even and odd views, respectively.

Based on the conditional probability theory, we can obtain

P(BC) = P(C|B)P(B), (3)

and

P(BD) = P(D|B)P(B). (4)

From Eqs. (3) and (4) and Table II, P(BC) exceeds P(BD) by more than 70 percentage points on average. Specifically, the average P(B) and P(C|B) are 84.17% and 93.62% in the even views, and 82.00% and 98.75% in the odd views. Thus, according to Eq. (3), P(BC) equals 78.80% and 80.98% in the even and odd views, respectively. From Table II, P(D|B) equals 6.38% and 1.25% on average in the even and odd views, respectively. Hence, based on Eq. (4), P(BD) equals 5.37% and 1.02% in the even and odd views, respectively.
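The arithmetic behind Eqs. (3) and (4) can be reproduced with a few lines of Python. This is only a sketch for checking the numbers: the dictionary and function names are illustrative and do not come from the paper, and the inputs are the average percentages quoted in the text from Table II.

```python
# Average probabilities (in percent) quoted in the text from Table II.
P_B = {"even": 84.17, "odd": 82.00}          # P(B): optimal mode is in T
P_C_GIVEN_B = {"even": 93.62, "odd": 98.75}  # P(C|B): DIRECT, given B
P_D_GIVEN_B = {"even": 6.38, "odd": 1.25}    # P(D|B): INTRA, given B


def joint(p_cond: float, p_b: float) -> float:
    """Joint probability in percent: P(XB) = P(X|B) * P(B), as in Eqs. (3)-(4)."""
    return p_cond * p_b / 100.0


for view in ("even", "odd"):
    p_bc = joint(P_C_GIVEN_B[view], P_B[view])
    p_bd = joint(P_D_GIVEN_B[view], P_B[view])
    print(
        f"{view} views: P(BC) = {p_bc:.2f}%, P(BD) = {p_bd:.2f}%, "
        f"gap = {p_bc - p_bd:.2f} percentage points"
    )
```

Up to rounding in the last digit, the printed values agree with the figures stated above, and the gap between P(BC) and P(BD) stays above 70 percentage points in both view types.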

Table III shows that, compared with the CPU time spent on S, T consumes very little encoding time: 2.14% and 1.00% in the even and odd views, respectively. Based on these values, we conclude that the INTRA modes in T have a small probability of being selected as the optimal mode and have quite low computational complexity. We therefore have two findings. 1) P(B) is large, but encoding the candidate modes in T consumes little coding time. Thus, compared with early termination for the modes in S, it is more reasonable to perform early termination for the modes in T. 2) P(BC) is much larger than P(BD) and makes up the major part of P(B). Thus, we mainly optimize early termination for the DIRECT mode.

III. PROPOSED EFFICIENT ME/DE ALGORITHM

A. Early DIRECT Mode Decision

The coded block pattern (CBP) is a syntax element in the encoded MB header that describes six 8 × 8 blocks, four luma blocks and two chroma blocks, for 4:2:0 subsampling [23]. A CBP value of 0 indicates that none of the six 8 × 8 blocks has non-zero quantized transform coefficients. Hence, if the CBP value of the DIRECT mode is equal to 0, the current MB is suitable for being encoded in DIRECT mode. As a result, the other candidate modes can be skipped, which saves significant encoding time.
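The CBP check described above can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: it covers only the CBP = 0 condition from this paragraph, whereas the paper's full decision also involves the RD cost. The function `choose_mode`, the evaluator interface, the mode name `INTER16x16`, and all cost and CBP values are hypothetical toy inputs chosen for the example.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical evaluator: maps a mode name to (rd_cost, cbp) for one MB.
ModeEvaluator = Callable[[str], Tuple[float, int]]


def choose_mode(evaluate: ModeEvaluator, other_modes: Iterable[str]) -> Tuple[str, int]:
    """Return (best_mode, number_of_modes_evaluated) for one macroblock.

    DIRECT is checked first. If its CBP is 0 (no non-zero quantized
    coefficients in any of the six 8x8 blocks), the remaining candidates
    are skipped. Otherwise every candidate is evaluated and the mode
    with the lowest RD cost is chosen.
    """
    best_mode = "DIRECT"
    best_cost, cbp = evaluate("DIRECT")
    evaluated = 1
    if cbp == 0:
        return best_mode, evaluated  # early DIRECT mode decision
    for mode in other_modes:
        cost, _ = evaluate(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, evaluated


# Toy (rd_cost, cbp) tables; the numbers are made up for illustration only.
toy_static = {"DIRECT": (120.0, 0), "INTER16x16": (150.0, 3), "I16MB": (400.0, 15)}
toy_moving = {"DIRECT": (900.0, 47), "INTER16x16": (310.0, 5), "I16MB": (650.0, 15)}
others = ["INTER16x16", "I16MB"]

print(choose_mode(toy_static.__getitem__, others))  # ('DIRECT', 1)
print(choose_mode(toy_moving.__getitem__, others))  # ('INTER16x16', 3)
```

In the first toy MB the DIRECT mode has CBP = 0, so only one mode is evaluated. In the second, the CBP is non-zero and the full candidate list is checked, which is where the ME/DE cost would be incurred in a real encoder.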

Four multiview video sequences (“Ballroom”, “Exit”, “Vassar” and “Doorflowers”) are used to analyze the CBP values when an MB is encoded in DIRECT mode. The experimental conditions are tabulated in Table I, and the statistical results are listed in Table IV. It is observed that when an MB is encoded in DIRECT mode, the probability of the CBP value
