Fig. 2. 3-DTV signal processing and data transmission chain consisting
of five functional building blocks: 1) 3-D content creation; 2) 3-D video
coding; 3) transmission; 4) “virtual” view synthesis; 5) 3-D display.
Fig. 3. Functionality of the Zcam active range camera. (a) An infrared light
wall is emitted by the camera. (b) The reflected light wall carries an imprint
of the captured 3-D scene. (c) The 3-D information is extracted by blocking
the remaining incoming light with a very fast shutter (from [24]).
supplementary depth-images can be compressed using any
of the newer, more efficient additions to the MPEG family
of standards such as MPEG-4 Visual [22] or the latest
Advanced Video Coding (H.264/AVC) [23].
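As a rough illustration of this coding step (not taken from any reference
implementation; the clipping planes, file name, and codec choice are
assumptions), the following sketch quantizes metric per-pixel depth to 8-bit
values between near and far clipping planes and writes the result as a
grayscale sequence through a standard MPEG-4 Visual encoder:

```python
# Minimal sketch: prepare per-pixel depth for transmission with a standard
# video codec. Metric depth is quantized to 8 bit between assumed near/far
# clipping planes (inverse-depth mapping gives finer resolution close to
# the camera) and encoded via the "XVID" (MPEG-4 Visual) fourcc.
import cv2
import numpy as np

Z_NEAR, Z_FAR = 1.0, 10.0   # assumed clipping planes in meters

def quantize_depth(z: np.ndarray) -> np.ndarray:
    """Map metric depth to 8 bit, finer resolution near the camera."""
    z = np.clip(z, Z_NEAR, Z_FAR)
    v = (1.0 / z - 1.0 / Z_FAR) / (1.0 / Z_NEAR - 1.0 / Z_FAR)
    return np.round(255.0 * v).astype(np.uint8)

h, w = 576, 720
writer = cv2.VideoWriter("depth.avi", cv2.VideoWriter_fourcc(*"XVID"),
                         25.0, (w, h), isColor=False)
for _ in range(25):                      # one second of synthetic depth
    z = np.random.uniform(Z_NEAR, Z_FAR, (h, w))
    writer.write(quantize_depth(z))
writer.release()
```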
To allow for an easier understanding of the fundamental
ideas, the envisioned signal processing and data transmission
chain of the outlined 3-DTV concept is illustrated in Fig. 2.
It consists of five functional building blocks: 1) 3-D content
creation; 2) 3-D video coding; 3) transmission; 4) “virtual”
view synthesis; and 5) 3-D display.
A. 3-D Content Creation
A number of approaches are applicable for the creation
of 3-D content. In one very appealing scenario, novel 3-D
material is generated by simultaneously capturing video and
associated per-pixel depth information with an active range
camera such as the Zcam developed by 3DV Systems, Ltd.
[24] or the NHK Axi-vision HDTV camera [25]. These de-
vices integrate a high-speed pulsed infrared light source into
a conventional broadcast TV camera, and they relate the time
of flight of the emitted and reflected light walls to direct
measurements of the depth characteristics of the 3-D scene
(Fig. 3).
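In essence, the scene depth follows from the round-trip time of the emitted
light as $Z = c\,\Delta t/2$, where $c$ denotes the speed of light and
$\Delta t$ the measured delay between emission and detection (the factor of
two accounts for the light traveling to the object and back); a delay of
13 ns, for example, corresponds to a depth of roughly 2 m.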
The main drawback of current 3-D cameras is that they
are suited only for indoor use in studio environments
and that they cannot record more than relatively
small-scale scenes (up to a few meters of depth). Thus,
alternative approaches are required for the generation of
3-D data for larger scale, outdoor scenes. Here, the most
promising concepts are based on the simultaneous capturing
of multiview data using either traditional stereo cameras or
synchronized multicamera systems (Fig. 4). Given several
images of the spatial scenery, the 3-D geometry can be
reconstructed by applying techniques from computer vision
(CV) and photogrammetry [15], [26]–[28].
Fig. 4. The Penn State multicamera system. A cluster of up to six FireWire
cameras is used to generate depth information of a human participant in an
immersive telepresence application (from [28]).
In general, most
existing methods involve five basic steps: 1) geometric and
photometric calibration of the individual cameras; 2) estima-
tion of geometrical relations between the different views; 3)
an optical flow or correlation-based search for corresponding
points in two or more image planes; 4) localization of the
corresponding 3-D space points; and 5) integration of the
entire depth information into one or more camera reference
frames.
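For concreteness, a minimal OpenCV-based sketch of steps 3)–5) for one
calibrated, rectified stereo pair might look as follows (the input files
and the identity disparity-to-depth matrix are placeholders, and the
calibration of steps 1) and 2) is assumed to be done beforehand):

```python
# Sketch of steps 3)-5): correspondences are found by correlation-based
# matching along rectified epipolar lines, then converted to 3-D points
# in the left camera's reference frame. Q is the 4x4 disparity-to-depth
# matrix produced, e.g., by cv2.stereoRectify during calibration.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # assumed inputs
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Step 3: correlation-based search for corresponding points.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                blockSize=9)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Steps 4-5: localize the corresponding 3-D space points and integrate
# the depth information into the left camera's reference frame.
Q = np.eye(4, dtype=np.float32)   # placeholder; use the real rectification
points_3d = cv2.reprojectImageTo3D(disparity, Q)
depth_map = points_3d[:, :, 2]    # per-pixel Z in the reference frame
```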
Even with these novel “3-D capture” technologies at hand,
it seems clear that the demand for high-quality 3-D content
can only partially be satisfied with new recordings.
It will therefore be necessary—especially in the introductory
phase of the new 3-DTV technology—to also convert already
existing 2-D video material into 3-D using so-called “struc-
ture from motion” algorithms. In principle, such (offline or
online) methods process one or more monoscopic color video
sequences to: 1) establish a dense set of image point cor-
respondences from which information about the recording
camera as well as the 3-D structure of the scene can be de-
rived [26], [27], [29], [30] or 2) infer approximate depth
information from the relative movements of automatically
tracked image segments [31].
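A two-frame sketch of the first variant could look like this (the intrinsic
matrix K and the file names are illustrative assumptions): correspondences
are tracked by optical flow, the camera motion is derived from the essential
matrix, and sparse 3-D structure follows by triangulation, up to a global
scale factor that is inherently unknown in structure from motion:

```python
# Two-frame "structure from motion" sketch with assumed intrinsics K.
import cv2
import numpy as np

K = np.array([[700.0, 0, 360], [0, 700.0, 288], [0, 0, 1]])  # assumed

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# 1) Establish image point correspondences by feature tracking.
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01,
                             minDistance=7)
p1, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)
p0, p1 = p0[status.ravel() == 1], p1[status.ravel() == 1]

# 2) Derive the recording camera's relative motion (R, t) from the
#    essential matrix, with RANSAC to reject outlier matches.
E, inliers = cv2.findEssentialMat(p0, p1, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, p0, p1, K, mask=inliers)

# 3) Triangulate sparse 3-D scene structure (up to global scale).
P0 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K @ np.hstack([R, t])
X = cv2.triangulatePoints(P0, P1, p0.reshape(-1, 2).T, p1.reshape(-1, 2).T)
X = (X[:3] / X[3]).T   # homogeneous -> Euclidean 3-D points
```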
B. “Virtual” View Synthesis
DIBR is defined as the process of synthesizing “virtual”
views of a real-world scene from still or moving images and
associated per-pixel depth information [32], [33]. Conceptu-
ally, this novel view generation method can be understood as
a two-step procedure: first, the original image points are
reprojected into the 3-D world, utilizing the respective depth
values. Thereafter, these intermediate space points are pro-
jected into the image plane of a “virtual” camera located at
the required viewing position. The concatenation of reprojec-
tion (2-D to 3-D) and subsequent projection (3-D to 2-D) is
usually referred to as “3-D image warping” in the computer
graphics (CG) literature.
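The following NumPy sketch illustrates this two-step procedure under
simplifying assumptions (a shared intrinsic matrix K, a purely translated
“virtual” camera without rotation, and illustrative parameter values; the
symbols are not the paper's notation):

```python
# Minimal 3-D image warping sketch: reproject pixels into the 3-D world
# using their depth values, then project the intermediate space points
# into the image plane of a translated "virtual" camera.
import numpy as np

def warp_to_virtual_view(depth: np.ndarray, K: np.ndarray,
                         t: np.ndarray) -> np.ndarray:
    """Return the warped pixel coordinates in the virtual view."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T

    # Step 1: reproject original image points into the 3-D world,
    # utilizing the respective per-pixel depth values.
    rays = np.linalg.inv(K) @ pix            # normalized viewing rays
    points = rays * depth.reshape(1, -1)     # scale rays by depth Z

    # Step 2: project the intermediate space points into the image
    # plane of the "virtual" camera displaced by translation t.
    proj = K @ (points - t.reshape(3, 1))
    return (proj[:2] / proj[2]).T.reshape(h, w, 2)

# Usage: a horizontal shift of the virtual camera yields purely
# horizontal pixel displacements (parallax) that scale with 1/Z.
K = np.array([[700.0, 0, 360], [0, 700.0, 288], [0, 0, 1]])
depth = np.full((576, 720), 5.0)             # flat scene, 5 m away
coords = warp_to_virtual_view(depth, K, np.array([0.03, 0.0, 0.0]))
```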
1) The “Virtual” Stereo Camera: Building on the de-
scribed 3-D image warping concept, the synthesis of stereo-
scopic images can be realized through the definition of two
“virtual” cameras—one for the left-eye and one for the right-
eye. With respect to the original (reference) view, these two
cameras are symmetrically displaced by half the interaxial
distance $t_c$ (Fig. 5). To establish the zero parallax setting
(ZPS), i.e., to choose the convergence distance $Z_c$ in the 3-D