Spatial-Aided Low-Delay Wyner-Ziv Video Coding


"Spatial-Aided Low-Delay Wyner-Ziv Video Coding" 这篇研究论文探讨了分布式视频编码中的一个重要问题，即如何提高Wyner-Ziv (WZ)帧编码的效率。在分布式视频编码中，侧信息（Side Information, SI）的质量对WZ编码的效果至关重要。通常，SI是由解码器通过运动补偿插值（Motion Compensated Interpolation, MCI）从过去的和未来的基准帧生成的，假设相邻帧之间的运动轨迹是平移且速度恒定。然而，这种假设并不总是成立，尤其是在具有高运动复杂性的视频场景中，这会导致编码效率不理想。 Wyner-Ziv编码是一种熵编码技术，用于在解码端利用已知的侧信息来提高压缩效率。在低延迟视频编码中，快速且准确的SI生成是关键，因为它直接影响到解码后的视频质量。论文提出的“空间辅助低延迟Wyner-Ziv视频编码”方法，旨在解决这一挑战。作者Bo Wu、Xiangyang Ji、Debin Zhao和Wen Gao提出了一种新的策略，该策略利用了空间信息来增强SI的生成，从而改进WZ编码的性能。这种方法可能包括更精确的运动估计技术，可能考虑了非线性运动、多模式运动或者更复杂的运动模型，以更好地匹配实际视频序列中的运动特性。论文中可能详细分析了现有MCI方法的局限性，并提出了改进方案，以适应不同类型的视频内容。通过实证研究，作者们可能展示了新方法在编码效率和视频质量方面的优势，可能包括更低的比特率、更小的延迟以及与传统WZ编码相比更高的视觉质量。此外，论文可能还讨论了算法的实现细节，如计算复杂性和内存需求，这对于实际应用非常重要。最后，可能还进行了与其他先进编码技术的比较，以证明所提方法的有效性和实用性。 "Spatial-Aided Low-Delay Wyner-Ziv Video Coding"这篇论文为分布式视频编码提供了一个创新的解决方案，通过改进侧信息的生成，提高了WZ编码在处理高运动复杂性视频时的性能，对视频压缩领域具有重要的理论和实践意义。

EURASIP Journal on Image and Video Processing 3

Figure 1: Framework of spatial-aided low-delay WZ codec. [Block diagram; recoverable block labels. Encoder: Key, Intra encoding, Auxiliary information generation, DCT, Quantizer, CA-VLC encoder, Extract bit-planes, Turbo encoder, Buffer. Decoder: Intra decoding, CA-VLC decoder, SA-MCE SI generation, Side information, Turbo decoder, Reconstruction, Delay. Signal labels: BP 0, Request bits.]

information. As a result, the resolution of the spatial auxiliary information is a quarter of the original frame. To reduce the temporal redundancy, DPCM is performed between adjacent LL subbands to encode the LL subband. For DPCM coding, the difference between the current LL subband and its previously reconstructed reference frame is calculated. The residues are then DCT transformed and quantized. Finally, the quantized coefficients are encoded by the CA-VLC entropy encoder used in H.264/AVC. If the reference frame is a key frame, the LL subband of the full-resolution reconstructed intra-frame is obtained by DWT to form the reference frame for DPCM coding.
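The DPCM step on the LL subband can be illustrated with a minimal Python sketch. It assumes a one-level Haar transform for the DWT (the excerpt does not name the wavelet), and the helper names `haar_ll` and `dpcm_residual` are hypothetical. The DCT, quantization, and CA-VLC stages applied to the residual are not shown.

```python
def haar_ll(frame):
    """One-level 2-D Haar LL subband (assumed wavelet): each 2x2 block sum divided by 2."""
    rows, cols = len(frame), len(frame[0])
    return [[(frame[r][c] + frame[r][c + 1]
              + frame[r + 1][c] + frame[r + 1][c + 1]) / 2.0
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]

def dpcm_residual(current_ll, reference_ll):
    """Difference between the current LL subband and the reconstructed reference LL."""
    return [[cur - ref for cur, ref in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current_ll, reference_ll)]

# Hypothetical 4x4 frames: the reference is a reconstructed key frame, so its
# LL subband is produced by the DWT before it serves as the DPCM reference.
key_frame = [[10, 10, 10, 10] for _ in range(4)]
wz_frame = [[10, 10, 12, 12],
            [10, 10, 12, 12],
            [10, 10, 10, 10],
            [10, 10, 10, 10]]

reference_ll = haar_ll(key_frame)
residual = dpcm_residual(haar_ll(wz_frame), reference_ll)
```

Each LL subband has half the width and half the height of its frame, which matches the quarter-resolution statement above; only the top-right LL sample differs from the reference in this toy example, so the residual is zero elsewhere.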

2.3. Wyner-Ziv Frame Coding. At the encoder, the whole WZ frame is encoded by DCT transform domain WZ coding [3]. First, a block-wise DCT is applied to the whole WZ frame so that the statistical dependencies within the frame are exploited. The transform coefficients are grouped together to form coefficient bands. Then, for each band, a different M-level uniform scalar quantizer is applied. Next, the bit-planes are extracted and each bit-plane is organized into fixed-length binary codewords. Each codeword is sent to the Slepian-Wolf (SW) encoder as input, and the output is the parity bits. The SW coder is implemented using a rate-compatible punctured turbo code (RCPT). These parity bits are then punctured into different blocks and stored in a buffer. The blocks of parity bits, also called WZ bits, are successively transmitted to the decoder upon request.
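The quantization and bit-plane extraction steps can be sketched as follows. This is a simplified illustration rather than the codec's implementation: it quantizes coefficient magnitudes only, and the function names, the step size, and the number of levels are assumptions chosen for the example. Grouping into fixed-length codewords and turbo encoding are omitted.

```python
def quantize_uniform(band, levels, step):
    """Map each coefficient magnitude to a quantization index in [0, levels - 1]."""
    return [min(int(abs(c) // step), levels - 1) for c in band]

def extract_bit_planes(indices, levels):
    """Split quantization indices into bit-planes, most significant plane first."""
    m = (levels - 1).bit_length()  # number of bit-planes needed for `levels` indices
    return [[(q >> (m - 1 - j)) & 1 for q in indices] for j in range(m)]

# Hypothetical coefficient band with an 8-level quantizer and step size 2.0.
band = [0.0, 3.2, -7.9, 12.5]
indices = quantize_uniform(band, levels=8, step=2.0)
planes = extract_bit_planes(indices, levels=8)
```

Each entry of `planes` holds one bit per coefficient position, so every bit-plane can be handed to the Slepian-Wolf encoder separately, starting with the most significant plane.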

At the decoder, the spatial auxiliary information of the current WZ frame is decoded first. Then, the SI of the whole WZ frame is generated with the help of the auxiliary information by an SA-MCE method, which is presented in Section 2.5. Subsequently, DCT is applied to the generated full-resolution SI, and the coefficients in each DCT block are extracted into different subbands corresponding to the DCT band partition patterns. The DCT coefficient $Y_i$ of the SI at the $i$th position in the current subband is used for the bit-plane probability evaluation. This means that for every original coefficient $X_i$, the value of $Y_i$ is used to evaluate the probability of every bit of $X_i$ being 1 or 0. The detailed description of the probability evaluation and the correlation model being used is given in the next subsection.

2.4. Correlation Model. As the turbo decoder obtains the side information, the a priori probability of the bit-plane currently being decoded should be calculated first. According to simulation results, the probability distribution of the difference between the source and its SI conforms to a Laplacian model, and thus the Laplacian model is taken as the probability density function for calculating the a priori probability. To estimate the value of the $j$th bit of $X_i$ being 0 or 1, the probability can be calculated as



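$$P\left(\hat{b}_i^j \mid y_i, s_i, b_i^1, \ldots, b_i^{j-1}\right) = \frac{\alpha}{2}\, e^{-\alpha |d|} \tag{1}$$

with

$$d = a \cdot \left(Z_i^j + \text{offset}\right) - y_i = a \left(b_i^1 \cdot 2^{m-1} + \cdots + b_i^{j-1} \cdot 2^{m-j+1} + \hat{b}_i^j \cdot 2^{m-j} + 2^{m-j-1}\right) - y_i \tag{2}$$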

Let $b_i^j$ denote the $j$th bit-plane at position $i$ in the current subband, and let $\hat{b}_i^j$ denote its estimation. The bits $\{b_i^1, \ldots, b_i^{j-1}\}$ are the previously decoded bits, and $b_i^1$ is the most significant bit. In (1), $y_i$ is the value of the SI coefficient $Y_i$ and $s_i$ is the sign bit: if the coefficient $X_i$ is positive, $s_i$ equals 0; otherwise $s_i$ equals 1. For each coefficient band, a different scale parameter $1/\alpha$ of the Laplacian model is adopted. The value of $1/\alpha$ is determined by offline training.

In (2), $Z_i^j$ represents the integer that has the $j$th bit $\hat{b}_i^j$ and the previously decoded, more significant bits $\{b_i^1, \ldots, b_i^{j-1}\}$. Offset is an estimated value used to compensate for the lower part of $Z_i^j$. If $X_i$ is partitioned into $m$ bins, offset equals $2^{m-j-1}$. $a$ is used to adjust the sign of the value $(Z_i^j + \text{offset})$ and is defined as

$$a = \begin{cases} 1, & s_i = 0, \\ -1, & s_i = 1. \end{cases} \tag{3}$$
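As a rough illustration of how (1)-(3) could be evaluated for one bit, the Python sketch below computes the Laplacian density for both candidate bit values. The function names, the sample value of alpha, and the final normalization of the two densities into probabilities are assumptions made for this example; the excerpt does not state how the turbo decoder consumes the values.

```python
import math

def laplacian_bit_density(bit, j, decoded_bits, sign, y, m, alpha):
    """Evaluate (1) for a candidate value of the j-th bit (1-based, MSB first)."""
    if len(decoded_bits) != j - 1:
        raise ValueError("decoded_bits must hold the j-1 more significant bits")
    bits = list(decoded_bits) + [bit]
    # Z in (2): decoded bits plus the candidate bit, weighted by 2^(m-k).
    z = sum(b << (m - k) for k, b in enumerate(bits, start=1))
    offset = 2.0 ** (m - j - 1)      # compensates the undecoded lower part
    a = 1 if sign == 0 else -1       # sign adjustment from (3)
    d = a * (z + offset) - y         # distance to the SI coefficient value
    return 0.5 * alpha * math.exp(-alpha * abs(d))

def bit_probabilities(j, decoded_bits, sign, y, m, alpha):
    """Normalize the two candidate densities (an assumption of this sketch)."""
    p0 = laplacian_bit_density(0, j, decoded_bits, sign, y, m, alpha)
    p1 = laplacian_bit_density(1, j, decoded_bits, sign, y, m, alpha)
    total = p0 + p1
    return p0 / total, p1 / total

# Hypothetical case: m = 3 bits, MSB already decoded as 1, positive sign,
# SI coefficient value y = 5.0, and an assumed alpha of 0.5.
p0, p1 = bit_probabilities(j=2, decoded_bits=[1], sign=0, y=5.0, m=3, alpha=0.5)
```

In this case the candidate 0 gives Z + offset = 5, which coincides with y, so the bit value 0 receives the larger probability.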
