Content Adaptive Image Subsampling Improves the Efficiency of Stereo Interleaving Video Coding


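This article discusses "stereo interleaving video coding with content adaptive image subsampling." Stereo interleaving video coding is an efficient coding strategy for stereoscopic video: the left-view and right-view frames are each downsampled to half size, merged into a single frame, and then encoded with a conventional 2D video encoder. The conventional approach processes every frame with fixed subsampling filter coefficients. This is simple to implement, but it ignores how the frame signal varies with content.

The authors observe that subsampling introduces distortion and that compression adds quantization error, and that the two together make up the final distortion in stereo interleaving video coding. They therefore carry out a rate-distortion analysis and, based on it, propose content adaptive image subsampling (CAIS), which adjusts the subsampling filter coefficients according to frame content in order to reduce distortion and improve coding efficiency. Unlike a fixed-filter strategy, CAIS selects the subsampling filter best suited to the content of each frame. The experimental results show that CAIS improves the compression efficiency of stereo interleaving video coding, giving higher image quality and lower distortion at the same bit rate. The work is relevant to stereoscopic video coding in bandwidth-demanding scenarios such as virtual reality and augmented reality.

Note that the text may not have received its final edit. The copyright notice limits use to personal use, and any other use requires formal permission from the IEEE. The article has been accepted for publication in a future issue of the journal, but its content may change before final publication. The paper's core contribution is a new coding strategy, content adaptive image subsampling, which improves the quality of stereo interleaved video while maintaining coding efficiency.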
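The statement that subsampling error and quantization error together make up the final distortion can be written out as a short decomposition. The symbols below are assumptions introduced only for illustration and do not come from the paper: $x$ is a full-resolution frame, $S$ the subsampling operator, $Q$ the lossy encode/decode step, $U$ the interpolation back to full resolution, and $\hat{x}$ the displayed frame. The first two lines are an identity; the last line holds only under the stated assumption that the two error terms are approximately uncorrelated.

```latex
\begin{aligned}
\hat{x} &= U\big(Q(S(x))\big), \\
x - \hat{x} &= \underbrace{x - U\big(S(x)\big)}_{e_s\ \text{(subsampling)}}
             + \underbrace{U\big(S(x)\big) - U\big(Q(S(x))\big)}_{e_q\ \text{(quantization)}}, \\
D = \mathbb{E}\,\lVert x - \hat{x} \rVert^2
  &\approx \underbrace{\mathbb{E}\,\lVert e_s \rVert^2}_{D_s}
   + \underbrace{\mathbb{E}\,\lVert e_q \rVert^2}_{D_q}
  \quad \text{if } \mathbb{E}\,\langle e_s, e_q \rangle \approx 0 .
\end{aligned}
```

Under this reading, a fixed subsampling filter leaves $D_s$ to chance, whereas a content-adaptive filter can reduce it for a given interpolator.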

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

Stereo Interleaving Video Coding with Content Adaptive Image Subsampling

Yongbing Zhang, Xiangyang Ji, Haoqian Wang, and Qionghai Dai



Abstract—Stereo interleaving video coding, where both left and right view frames are subsampled to half size and multiplexed into one single frame before being encoded by a traditional 2D video encoder, is an efficient encoding scenario for stereoscopic video. Many existing stereo interleaving video coding methods subsample each frame using fixed subsampling filter coefficients. Such methods are easy to implement; however, the varying property of the frame signal is ignored. By jointly considering the influences of subsampling and compression, a rate-distortion analysis of stereo interleaving video coding is proposed. The final distortion in stereo interleaving video coding is the sum of the errors caused by subsampling (the distortion between the subsampled-then-interpolated image and the original full-resolution one) and by quantization during compression. Based on this rate-distortion analysis, a content adaptive image subsampling (CAIS) method is also proposed. In CAIS, the half-size frames are generated by the optimal subsampling filters, which are calculated from the frame contents and the targeted interpolation coefficients. Experimental results demonstrate that the proposed CAIS greatly improves the compression efficiency of stereo interleaving video coding.
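The idea of computing half-size samples from the frame content and a targeted interpolator can be illustrated with a small least-squares sketch. This is not the paper's derivation, which is not part of this excerpt. It is a 1D stand-in that assumes a linear interpolator, ignores quantization, and compares against plain decimation as the fixed baseline. All function names (`interpolation_matrix`, `adaptive_subsample`, and so on) are hypothetical.

```python
def interpolation_matrix(m):
    """Build U (2m x m): even outputs copy a sample, odd outputs average two neighbours.

    Assumption: linear interpolation stands in for the 'targeted interpolation
    coefficients'; the paper's actual interpolator is not given in this excerpt.
    """
    u = [[0.0] * m for _ in range(2 * m)]
    for k in range(m):
        u[2 * k][k] = 1.0
        nxt = min(k + 1, m - 1)  # replicate the last sample at the border
        u[2 * k + 1][k] += 0.5
        u[2 * k + 1][nxt] += 0.5
    return u


def matvec(a, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def sse(x, x_hat):
    """Sum of squared errors between two equal-length signals."""
    return sum((p - q) ** 2 for p, q in zip(x, x_hat))


def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def adaptive_subsample(x, u):
    """Half-size samples y minimising ||x - U y||^2 via the normal equations."""
    ut = [list(col) for col in zip(*u)]  # transpose of U
    gram = [[sum(p * q for p, q in zip(ri, rj)) for rj in ut] for ri in ut]
    return solve(gram, matvec(ut, x))


# Toy full-resolution row (made-up values, for illustration only).
x = [10.0, 12.0, 30.0, 31.0, 29.0, 8.0, 9.0, 11.0]
u = interpolation_matrix(len(x) // 2)

y_fixed = x[::2]                      # fixed baseline: keep every other sample
y_adapt = adaptive_subsample(x, u)    # content-dependent least-squares samples

d_fixed = sse(x, matvec(u, y_fixed))  # subsampling distortion, fixed baseline
d_adapt = sse(x, matvec(u, y_adapt))  # subsampling distortion, adaptive samples
print(d_fixed, d_adapt)
```

By construction the least-squares samples cannot give a larger interpolation error than any other choice for the same interpolator. A complete scheme would also have to account for the bit rate and the quantization error of the encoded half-size frame, which this sketch leaves out.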

Index Terms—Stereo interleaving video, frame packing arrangement, content adaptive, rate distortion analysis

I. INTRODUCTION

In recent years, stereoscopic video has drawn significant attention, with more and more products and services becoming available in the consumer markets. Stereoscopic video, a type of visual media that provides depth perception of the observed scenery, creates a perception of 3D using two 2D images [1]. Each 2D image is selectively targeted at either the left eye or the right eye in a way designed to recruit the brain's natural depth-sensing abilities. The 3D depth perception can be provided by 3D display systems which ensure that the user observes a specific different view with each eye [2]. With the rapid growth of 3D content, stereoscopic video is also an increasingly interesting technology for home living-room and mobile 3D video services [3].

Manuscript received March 10, 2012; revised June 19, 2012. This work was partially supported by National Science Foundation of China (61170195), the Joint Funds of National Science Foundation of China (U0935001), the Upgrading Project of Shenzhen Key Laboratory (CXB201005260071A), and the Basic Research Plan in Shenzhen City (JC201005310709A and JC201105201110A). This paper was recommended by Associate Editor Levent Onural.

Y. Zhang and H. Wang are with Shenzhen Key Laboratory of Broadband Network & Multimedia, Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China (e-mail: ybzhang@tsinghua.edu.cn; wanghaoqian@tsinghua.edu.cn).

X. Ji and Q. Dai are with the Broadband Networks and Digital Media Laboratory, Automation Department, Tsinghua University, Beijing 100084, China (e-mail: xyji@tsinghua.edu.cn; qhdai@tsinghua.edu.cn).

However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.

Compared with 2D video, stereoscopic video carries twice the amount of data because of the extra view. Consequently, additional bandwidth is required for transmission and storage [4], which places a high demand on the efficient compression of stereoscopic video. There are various ways to encode stereoscopic video. The most direct one is simulcast video coding, which encodes the left and right view frames independently. This can easily be realized with a traditional 2D video coding system. However, the bit rate in simulcast video coding is doubled, which poses a great challenge to existing transmission and encoding systems. Another way is to encode the stereoscopic video by exploiting the inter-view redundancies, namely inter-view prediction [5], [6]. Under this encoding scenario, the left view video is encoded by a traditional 2D video encoder, for example H.264/AVC [7], while the reconstructed left view frame can also be referenced by the right view frames. This method is able to significantly improve the encoding performance. To compress stereoscopic video efficiently, MPEG added the Stereo High Profile [8] in July 2009 to deal specifically with the case in which the multiple views of Multiview Video Coding (MVC) are the left and right stereo views. The Stereo High Profile limits the number of encoded views to two and includes support for interlaced coding tools, so that the resulting profile supports the same set of coding tools as the prior High Profile, but with stereo inter-view prediction enabled. However, the inter-view prediction encoding scenario requires upgrading the existing infrastructure and equipment, since additional bandwidth cannot be avoided. Alternatively, the stereo interleaving video encoding scheme [9-12] can be explored using existing 2D video coding methods.
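The packing step that lets an ordinary 2D encoder carry both views can be sketched as follows. The side-by-side layout, the fixed [1, 2, 1]/4 low-pass filter, and the function names are assumptions chosen for illustration. The text only states that both views are subsampled to half size and multiplexed into a single frame.

```python
def subsample_row(row, taps=(0.25, 0.5, 0.25)):
    """Low-pass filter a row with a fixed 3-tap kernel, then keep every other sample.

    Assumption: a [1, 2, 1]/4 kernel with replicated borders stands in for the
    'fixed subsampling filter coefficients' mentioned in the text.
    """
    n = len(row)
    out = []
    for i in range(0, n, 2):
        left = row[max(i - 1, 0)]        # replicate the border sample
        right = row[min(i + 1, n - 1)]
        out.append(taps[0] * left + taps[1] * row[i] + taps[2] * right)
    return out


def pack_side_by_side(left_view, right_view):
    """Horizontally subsample both views and place them next to each other.

    The packed frame has the same width as one original view, so it can be
    passed to an unmodified 2D encoder (hypothetical helper, not from the paper).
    """
    if len(left_view) != len(right_view):
        raise ValueError("views must have the same height")
    return [subsample_row(l) + subsample_row(r)
            for l, r in zip(left_view, right_view)]


# Two tiny 2x4 'views' with made-up luma values.
left = [[10, 10, 20, 20], [30, 30, 40, 40]]
right = [[50, 50, 60, 60], [70, 70, 80, 80]]

packed = pack_side_by_side(left, right)
print(packed)  # [[10.0, 17.5, 50.0, 57.5], [30.0, 37.5, 70.0, 77.5]]
```

Each packed row holds the half-width left view followed by the half-width right view. A receiver would split the row in the middle and interpolate each half back to full width.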

Compared with the former two methods, the stereo interleaving video encoding scenario facilitates the introduction of stereoscopic services without upgrading the existing infrastructure and equipment. Besides, stereo interleaving video encoding easily supports synchronization between the two views [3]. As a result, the stereo interleaving video encoding scheme has received considerable attention from the broadcast industry. Much pioneering work has been done to improve the efficiency of stereo interleaving video coding. For example, an adaptive interpolation has been proposed [11], where segmentation is performed within each frame and a common interpolation mode is applied to each segment. However, the interpolation mode information has to be transmitted to the receiver, which is not compatible with the existing 2D video coding standards. In addition, [12] proposed an enhanced rate distortion optimization method, which utilizes the distortion between upsampled reconstructed block
