724 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 2, FEBRUARY 2015
Toward Naturalistic 2D-to-3D Conversion
Weicheng Huang, Xun Cao, Member, IEEE, Ke Lu, Qionghai Dai, Senior Member, IEEE,
and Alan Conrad Bovik, Fellow, IEEE
Abstract—Natural scene statistics (NSS) models have been
developed that make it possible to impose useful perceptually
relevant priors on the luminance, colors, and depth maps of
natural scenes. We show that these models can be used to develop
3D content creation algorithms that can convert monocular
2D videos into statistically natural 3D-viewable videos. First,
accurate depth information on key frames is obtained via human
annotation. Then, both forward and backward motion vectors are
estimated and compared to decide the initial depth values, and
a compensation process is applied to further improve the depth
initialization. Then, the luminance/chrominance and initial depth
map are decomposed by a Gabor filter bank. Each subband of
depth is modeled to produce an NSS prior term. The statistical
color–depth priors are combined with the spatial smoothness
constraint in the depth propagation target function as a prior
regularizing term. The final depth map associated with each
frame of the input 2D video is optimized by minimizing the
target function over all subbands. Finally, stereoscopic frames
are rendered from the color frames and their associated depth
maps. We evaluated the quality of the generated 3D videos
using both subjective and objective quality assessment methods.
The experimental results obtained on various sequences show
that the presented method outperforms several state-of-the-art
2D-to-3D conversion methods.
Index Terms— 2D-to-3D conversion, depth propagation,
natural scene statistics, Bayesian inference.
I. INTRODUCTION
THREE-DIMENSIONAL (3D) video has become quite
popular in recent years. Yet, the proliferation of
3D capture and display devices has not been matched
by a corresponding degree of availability of quality
3D video content. Towards helping to overcome this
3D content shortage, a new 3D content creation technology,
2D-to-3D conversion, is being developed to convert existing
2D videos into 3D videos [1], [2].
Manuscript received October 15, 2013; revised May 30, 2014; accepted
December 17, 2014. Date of publication December 23, 2014; date of current
version January 9, 2015. This work was supported in part by the National
Science Foundation of China under Project 61371166 and Project 61422107,
in part by the Importation and Development of High-Caliber Talents Project
through the Beijing Municipal Institutions under Grant IDHT20130225, in
part by the National Natural Science Foundation of China under Grant
61103130 and Grant 61271435, and in part by the National Program on Key
Basic Research Project (973 Program) under Grant 2010CB731804-1. The
associate editor coordinating the review of this manuscript and approving it
for publication was Prof. Charles Boncelet.
W. Huang and K. Lu are with the College of Engineering and Information
Technology, University of Chinese Academy of Sciences, Beijing 100049,
China (e-mail: luk@ucas.ac.cn).
X. Cao is with the School of Electronic Science and Engineering, Nanjing
University, Nanjing 210093, China (e-mail: caoxun@nju.edu.cn).
Q. Dai is with the Department of Automation, Tsinghua University,
Beijing 100084, China (e-mail: qhdai@tsinghua.edu.cn).
A. C. Bovik is with the Department of Electrical and Computer
Engineering, University of Texas at Austin, Austin, TX 78712 USA (e-mail:
bovik@ece.utexas.edu).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2014.2385474
2D-to-3D video conversion methods can be divided into
two categories, depending on whether human-computer
interaction is involved in the conversion process: fully-
automatic methods and semi-automatic methods [2]. Current
fully-automatic methods are generally only able to deliver
a limited 3D effect. Semi-automatic methods, however, make it
possible to balance 3D content quality against production
cost, and they have been used to convert popular older films,
such as the Star Wars series and Titanic, into successful
cinematic 3D presentations [3]. The
general approach to semi-automatic 2D-to-3D conversion is to
manually or semi-manually create high quality depth maps at
strategically chosen key frames or parts of frames, then propa-
gate depth information from the key frames to non-key frames
to initialize depth calculations at non-key frames (see Fig. 1 for
an illustration). The highest cost arises during the process of
assigning depths to key frames, whereas the 3D quality of
the final production largely depends on the accuracy of the
key frame depth maps, the key frame separations, and the
depth propagation method. Smaller key frame intervals and
more accurate key frame depths lay a better foundation for
subsequent depth propagation, leading to improved stereo
quality; unfortunately, they also increase the cost.
Developing depth propagation methods that effectively
control depth errors can make it possible to relax the key frame
interval constraints, while also significantly improving the
final quality. The additional algorithmic complexity of such
automation is a negligible cost compared with the reduction
in human-computer interaction it yields. This is the main reason why
depth propagation plays such a critical role in 2D-to-3D video
conversion.
Recently, statistical models of natural scenes have proven
to provide useful constraints on many image processing and
computer vision problems, including image compression [4],
image and video quality prediction [5], image denoising [6]
and stereo matching [7], [8]. They provide powerful statistical
priors that can force ill-posed visual problems towards stable,
naturalistic solutions. For example, the univariate distributions
of band-pass luminance images (wavelet coefficients) are
well-modeled as obeying a generalized Gaussian distribution:
P(c) = \frac{e^{-|c/s|^{p}}}{Z(s, p)}    (1)
where Z(s, p) is a normalizing constant that forces the integral
of P(c) to be 1, while the parameters p, s control the shape
and spread of the distribution, respectively. Liu et al. [7] also
showed that the conditional magnitudes of luminance and
depth are mutually dependent, i.e., regions exhibiting larger
luminance variations often have larger depth variations and
vice versa.
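As a concrete illustration of the distribution in (1), the short numerical sketch below evaluates the density using the standard closed-form normalizer for this family, Z(s, p) = 2sΓ(1/p)/p, and verifies that it integrates to one. The parameter values are illustrative assumptions, not values taken from the paper.

```python
import numpy as np
from math import gamma

def ggd_pdf(c, s, p):
    """Generalized Gaussian density P(c) = exp(-|c/s|^p) / Z(s, p).

    Z(s, p) = 2 * s * Gamma(1/p) / p is the closed-form normalizing
    constant that makes the density integrate to 1 over the real line.
    """
    Z = 2.0 * s * gamma(1.0 / p) / p
    return np.exp(-np.abs(c / s) ** p) / Z

# Shape p < 1 gives the sharp peak and heavy tails typical of
# band-pass (wavelet/Gabor) luminance coefficients.
c = np.linspace(-60.0, 60.0, 400001)
dc = c[1] - c[0]
area = ggd_pdf(c, s=1.0, p=0.7).sum() * dc  # Riemann-sum check of the normalization
print(f"integral of P(c): {area:.4f}")
```

Smaller values of p concentrate more probability mass near zero while fattening the tails, which is why fitted shape parameters well below 2 (the Gaussian case) are characteristic of band-pass natural image statistics.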
1057-7149 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.