SUPERPIXEL MATCHING-BASED DEPTH PROPAGATION FOR 2D-TO-3D CONVERSION
WITH JOINT BILATERAL FILTERING
Cheolkon Jung and Jiji Cai
School of Electronic Engineering, Xidian University, Xi’an 710071, China
zhengzk@xidian.edu.cn
ABSTRACT
In this paper, we propose superpixel matching-based depth
propagation for 2D-to-3D video conversion with joint
bilateral filtering. First, we perform superpixel matching,
instead of block matching, to estimate the motion vectors of
superpixels from the reference depth map. Because a
superpixel is a group of pixels with similar characteristics,
superpixel matching estimates motion vectors more
accurately than block matching. Then, we conduct depth
compensation based on the motion vectors to obtain the
current depth map. However, superpixels vary in size,
which causes matching errors in the compensated depth
map. Thus, we perform joint bilateral filtering to refine the
depth map. Experimental results show that the proposed
algorithm successfully performs depth propagation and
produces high-quality depth maps for 2D-to-3D conversion.
Index Terms—2D-to-3D conversion, depth
compensation, depth propagation, joint bilateral filtering,
superpixel matching.
1. INTRODUCTION
Stereoscopic three-dimensional (S3D) videos dramatically
enhance the traditional viewing experience by immersing the
viewer. However, the available S3D content is insufficient
for the proliferation of S3D display devices. To overcome
this content shortage, 2D-to-3D conversion, which estimates
3D information from monocular video shots, is very useful
for converting conventional 2D video into S3D content [1].
Existing 2D-to-3D conversion methods are classified into
three main categories: manual, fully automatic, and
semi-automatic approaches [2]. Although manual methods
can produce the most accurate depth map for each individual
frame, their enormous time cost and human involvement
make them impractical in most applications. In contrast,
fully automatic 2D-to-3D conversion utilizes depth cues
such as motion, linear perspective, atmospheric perspective,
texture gradient, and relative height to estimate the 3D
structure of 2D scenes [3, 4] without any user participation.
However, these automatic techniques suffer from inaccurate
depth estimation and poor adaptation to varying video
content, which degrades the viewing experience. In recent
years, semi-automatic 2D-to-3D conversion methods [5-10]
have provided a better balance between quality and cost than
fully automatic ones by introducing human-computer
interaction. A representative approach to semi-automatic
2D-to-3D conversion is depth propagation. Its main idea is
to manually or semi-manually create high-quality depth
maps at key frames, i.e., reference depth maps, and then
propagate them to non-key frames, i.e., current depth maps.
For manual depth assignment, computer software (e.g.,
Photoshop) or algorithms (e.g., Lazy Snapping [11]) may be
used to facilitate the user's operation, which is beyond the
scope of this paper. Instead, we focus on propagating the
reference depth maps of key frames to those of non-key
frames. Recent approaches are mostly based on motion
estimation and compensation. For example, Varekamp and
Barenbrug [8] propagated depth information from key
frames to non-key frames via motion compensation. Lie et
al. [9] introduced trilateral filtering to reduce depth
propagation errors. Cao et al. [10] proposed a
semi-automatic conversion method that first performed
multi-object segmentation to create disparity maps for key
frames and then employed shifted bilateral filtering to
propagate depth to non-key frames. In general, these
methods suffer from inaccurate motion estimation and
compensation due to complex local motions or object
occlusions. As shown in Fig. 1, the three girls' hands
undergo complex local motions; moreover, the hands are
small and their colors are similar to the background.
Because block matching relies only on color differences
within rectangular blocks of a fixed size, it produces large
errors in such regions (see the red circles in Fig. 1).
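The fixed-window matching criticized above can be sketched as follows. This is a generic block-matching illustration, not the paper's implementation: the function name, block size, and search range are our own illustrative choices, and the cost is a simple sum of absolute differences (SAD) over grayscale frames, i.e., color difference within a rectangular block of a fixed size.

```python
import numpy as np

def block_match_depth(cur, ref, ref_depth, block=8, search=4):
    """Generic block matching for depth propagation: for each fixed-size
    rectangular block of the current frame, find the best SAD (color) match
    in the reference frame and copy the corresponding reference depth.
    The fixed block size and the color-only cost are exactly the
    limitations noted above for small regions with background-like colors."""
    h, w = cur.shape
    out = np.zeros_like(ref_depth)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            cur_blk = cur[by:by + block, bx:bx + block].astype(np.int64)
            best, best_dy, best_dx = None, 0, 0
            # Exhaustive search over a small window of candidate motions.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue
                    sad = np.abs(cur_blk - ref[y:y + block, x:x + block]).sum()
                    if best is None or sad < best:
                        best, best_dy, best_dx = sad, dy, dx
            # Copy the reference depth of the best-matching block.
            out[by:by + block, bx:bx + block] = \
                ref_depth[by + best_dy:by + best_dy + block,
                          bx + best_dx:bx + best_dx + block]
    return out
```

When an object is smaller than the block and its color resembles the background, the SAD cost is dominated by background pixels, so the estimated motion vector (and hence the copied depth) is wrong in exactly the way Fig. 1 shows.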
In this paper, we propose superpixel matching-based depth
propagation for 2D-to-3D conversion with joint bilateral
filtering.
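As a rough illustration of the refinement step, a joint bilateral filter averages neighboring depth samples with weights that combine spatial closeness and color similarity in a guidance image, so depth edges are pulled toward color edges. The single-channel sketch below is our own minimal version; the function name, window radius, and sigma values are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def joint_bilateral_filter(depth, guide, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Refine a depth map using a grayscale color image as guidance.
    Each output pixel is a weighted average of neighboring depth values,
    where the weight combines spatial closeness (sigma_s) with color
    similarity in the guidance image (sigma_r)."""
    h, w = depth.shape
    df = depth.astype(np.float64)
    gf = guide.astype(np.float64)
    d = np.pad(df, radius, mode='edge')
    g = np.pad(gf, radius, mode='edge')
    num = np.zeros((h, w))
    den = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Depth and guidance values of the neighbor at offset (dy, dx).
            d_shift = d[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            g_shift = g[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            w_spatial = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            w_range = np.exp(-((g_shift - gf) ** 2) / (2.0 * sigma_r ** 2))
            weight = w_spatial * w_range
            num += weight * d_shift
            den += weight
    return num / den  # den > 0 everywhere (the center weight is 1)
```

Because the range weight is computed on the color image rather than on the depth map itself, neighbors that belong to a different object (different color) contribute little, which suppresses the compensation errors discussed above.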
Fig. 1 Inaccurate depth propagation caused by block matching under
complex local motions and object occlusions. (a) Reference color image.
(b) Current color image. (c) Depth map estimated by block matching.