926 IEEE Transactions on Consumer Electronics, Vol. 58, No. 3, August 2012
Contributed Paper
Manuscript received 06/24/12
Current version published 09/25/12
Electronic version published 09/25/12. 0098-3063/12/$20.00 © 2012 IEEE
Inter Mode Selection for Depth Map Coding
in 3D Video
Liquan Shen, Zhaoyang Zhang, Zhi Liu
Abstract —3D video (3DV) data usually includes both
conventional 2D videos and corresponding depth maps. In 3DV
coding, color videos and depth maps need to be jointly coded.
Usually, a 3DV coding system uses a two-channel method that
encodes color videos and depth maps with two parallel codec
implementations, making the complexity and hardware
requirements nearly twice those of coding
2D color videos. While low-complexity color video coding has
been broadly studied, depth map coding has received much less
attention. The depth map represents the distance from the camera
to objects in the scene, and thus has characteristics both of
Z-axis data in world coordinates and of a video signal.
Consequently, there is a high correlation between the motion
information (prediction mode, reference frame, and motion vector)
of a color video and that of its depth map. Based on this
observation, we propose a method to
reduce depth coding complexity. An experimental analysis is
performed to study the prediction-mode correlation between the
coding information of color videos and that of depth maps. Based
on this correlation, we propose an efficient mode decision algorithm.
With almost the same rate-distortion (RD) performance, the
proposed algorithm reduces the computational complexity of depth
coding by about 70%, which is beneficial for real-time hardware
or software implementations in 3DV applications.1
Index Terms —3DTV/FTV, depth coding, mode decision.
I. INTRODUCTION
Although multi-view video (MVV) can provide both an
immersive sense of realism and free viewpoint navigation, it has
shortcomings that prevent its direct use in 3DTV or free
viewpoint television (FTV) systems [1]. The
performance of the MVV system highly depends on the number
of original views. Thus, the system must capture a very large
number of views and encode a huge amount of multi-view data
to display a realistic 3D scene with multiple viewpoints at the
decoder side. The main challenge of the MVV system is therefore
its high storage and transmission-bandwidth requirements. To
address this problem, the Moving Picture Experts Group (MPEG)
has initiated work toward a new standard for 3DTV and FTV,
referred to as 3DVC (3D video coding).
1 This work is sponsored by the Shanghai Rising-Star Program
(11QA1402400) and the Innovation Program of the Shanghai Municipal
Education Commission, and is supported by the National Natural Science
Foundation of China under grants No. 60832003, 60902085, and 61171084.
The authors are with the Key Laboratory of Advanced Display and System
Application, Shanghai University, Ministry of Education, Shanghai 200072,
China (e-mail: jsslq@163.com).
Recently, new data formats including captured 2D video sequences and
corresponding depth maps have been proposed for 3DVC. With
color videos and depth maps, virtual views can be generated
using Depth-Image-Based-Rendering (DIBR) techniques [2].
The depth image represents a relative distance from a camera to
an object in the 3D space, which is widely used in computer
vision and computer graphics fields to represent 3D information.
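In MPEG-style depth formats, the 8-bit depth sample is commonly an inverse-depth quantization between a near and a far clipping plane, which a DIBR renderer inverts to recover metric depth before warping. The sketch below illustrates that mapping; the clipping-plane values are illustrative, not parameters from this paper:

```python
def depth_sample_to_z(v, z_near=0.5, z_far=10.0):
    """Map an 8-bit depth sample v (0..255) to metric depth Z.

    Assumes the inverse-depth quantization common in MPEG-style
    depth maps: v = 255 corresponds to z_near (the closest point),
    v = 0 to z_far. z_near and z_far are illustrative values.
    """
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z

# Larger sample values mean closer objects: v = 255 -> z_near, v = 0 -> z_far.
```

A DIBR renderer would then convert Z to a per-pixel disparity using the camera baseline and focal length before warping the color pixels to the virtual view.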
Since depth maps are required to be transmitted together with 2D
videos, depth compression needs to be investigated in 3DV
coding. Typically, a depth image consists of depth samples, each
represented by a scaled 8-bit value and corresponding to a pixel
in the video frame. It can be regarded as
a typical grayscale image/video [3]. Thus the most
straightforward approach to compress depth map sequences is to
encode them using conventional image/video compression
algorithms such as H.264/AVC or multiview video coding (MVC).
Existing depth coding techniques can be classified into
two main groups according to their relation to color video
coding: independent coding and joint coding. Independent depth
coding techniques encode the depth image using the
characteristics of depth data. A platelet-based independent depth
video coding scheme is proposed in [4]; it employs a
quadtree decomposition that divides the image into blocks and
models each block with one of four platelet functions. A mesh-based
depth coding is proposed in [5] to improve compression
efficiency. The main problem with these independent coding
schemes is their limited coding efficiency, since
redundancies between the color video and the corresponding
depth map are not exploited. In contrast, joint coding
algorithms proposed in [6-9] consider the correlation between
the depth map and the corresponding video. Motion information
from the color video is utilized to improve the efficiency of
depth map coding in [6-7]. A joint coding method for both the color
video and the depth map in [8] uses the concept of the layered
depth image to represent and process multi-view video with
depth. The depth map coding for view synthesis is proposed to
improve the view rendering quality in [9]. However, these joint
depth coding techniques focus only on the improvement of
depth coding efficiency and do not evaluate the coding
complexity. As noted above, a typical 3DV coding system encodes
the color video and depth map sequences with two parallel H.264
codec implementations, so its complexity and hardware requirements
are nearly twice those of coding 2D videos. While low-complexity
color video coding has been broadly studied, depth coding has
received much less attention. Hence, how to code the depth map
sequence efficiently, and at low complexity, is an important issue.
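The correlation described above can be turned into a simple candidate-mode pruning rule, sketched below for H.264 inter macroblock partitions: since depth maps are piecewise smooth, a depth macroblock rarely needs a finer partition than its co-located color macroblock, so only the color mode and coarser modes are evaluated. The mode list and pruning rule here are an illustrative sketch, not the exact algorithm derived in the remainder of this paper:

```python
# H.264 inter macroblock partition modes, ordered coarse to fine
# (8x8 implies further sub-macroblock partitions).
MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8"]

def candidate_depth_modes(color_mode):
    """Prune the mode search for a depth macroblock using the best
    mode of the co-located color macroblock.

    Illustrative heuristic: evaluate the color mode itself plus all
    coarser partitions, and skip the finer ones entirely.
    """
    idx = MODES.index(color_mode)
    return MODES[:idx + 1]

print(candidate_depth_modes("16x8"))  # ['SKIP', '16x16', '16x8']
```

Skipping the finer partitions avoids their motion estimation entirely, which is where most encoding time is spent; savings of this kind are behind the roughly 70% complexity reduction reported in the abstract.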