An Efficient Coding Method for 3D Holoscopic Images Driven by Sparse Interlaced Views and Disparity


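This research paper presents a new coding method for 3D holoscopic imaging, titled "Scalable Coding of 3D Holoscopic Image by Using a Sparse Interlaced View Image Set and Disparity Map". 3D holoscopic imaging techniques, such as integral imaging, light-field imaging, or plenoptic imaging, provide a natural, fatigue-free 3D viewing experience. A holoscopic camera captures both spatial and angular information about a scene, so view images can be rendered from different viewpoints. The paper's central goal is to exploit the high spatial correlation within 3D holoscopic content to design an efficient scalable coding scheme. First, the authors rearrange the holoscopic content into a sparse interlaced view image set. This organization captures the structural repetition between neighboring views and so saves storage during coding. Next, a new compressed format is proposed to represent the interlaced view images, using their sparsity to reduce the data volume. The paper then proposes an image reconstruction process based on layering and interpolation driven by the disparity map, with the aim of compressing efficiently while maintaining image quality. The advantage of the method is its scalability: resolution can be adjusted on demand without sacrificing image detail. This matters when holoscopic content is shown on different devices, such as phones, head-mounted displays, or large high-resolution screens. In practice, the technique may help reduce bandwidth requirements, improve transmission efficiency, and lower the processing load on end devices. In summary, the paper proposes a coding framework for 3D holoscopic image processing that combines sparse representation with disparity information, offering a new approach to the efficient transmission and display of 3D holoscopic content, with both theoretical and practical value for improving the 3D viewing experience and making holoscopic technology more practical.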

Scalable Coding of 3D Holoscopic Image by Using a Sparse Interlaced View Image Set and Disparity Map

However, the coding schemes mentioned above do not fully exploit the high spatial correlation among the rendered view images. Although some compression schemes decompose the holoscopic content into multi-view sequences, they simply apply the MVC standard to reduce redundancy. Moreover, such coding schemes cannot provide scalable coding of holoscopic content. As for scalable coding of 3D holoscopic images, a method that uses the rendered views as prediction references is proposed in [22]. However, the inter-layer prediction process in [22] rests on the hypothesis that the disparities of all adjacent EIs are approximately equal, which does not always hold. Additionally, the bit rate of the reference image is not included in the final bit stream, which may influence the decoded holoscopic image quality.

To improve coding efficiency by exploiting the correlation among the rendered view images, and to provide coding scalability, this paper proposes a scalable coding scheme that uses a sparse interlaced view image set and a disparity map. To describe the spatial correlation among the rendered view images clearly, we propose to use the interlaced view image to represent the 3D holoscopic content. Before encoding, some of the redundancy between neighboring VIs in the interlaced view image is first removed by using a sparse interlaced view image set and the corresponding disparity map. Then, based on the sparse interlaced view image set and the disparity map, a full interlaced view image can be reconstructed by shifting with simple interpolation. Finally, the reconstructed interlaced view image is used as a reference to predict the original interlaced view image with a modified HEVC encoder. The proposed scalable coding method has a three-layer structure: spatial resolution scalability is provided from the first to the second layer, and quality scalability from the second to the third layer. The main contributions of this paper are: 1) the interlaced view image is used to represent the 3D holoscopic content in order to exploit the high spatial correlation among the rendered view images; 2) a sparse interlaced view image and the corresponding disparity map are used to code the interlaced view image; 3) coding scalability is enabled in the proposed coding scheme. Note that this work is limited to the compression of 3D holoscopic images captured by the Plenoptic Camera 2.0 [26].
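The reconstruction step is only outlined at this point, and this excerpt does not give the paper's actual procedure. As a rough illustration of what shifting with simple interpolation could look like, the sketch below warps one view image horizontally by a per-pixel integer disparity and fills the resulting holes by linear interpolation along each row. The function name `shift_and_fill`, the scale factor `k`, the horizontal-only shift, and the integer disparities are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def shift_and_fill(view, disparity, k=1):
    """Hypothetical sketch: warp `view` by k * disparity, then fill holes.

    view      : 2-D array (one grayscale view image)
    disparity : 2-D integer array, same shape, horizontal shift per pixel
    k         : assumed scale factor (e.g. distance to the target view)
    """
    view = np.asarray(view, dtype=float)
    disparity = np.asarray(disparity, dtype=int)
    height, width = view.shape
    cols = np.arange(width)
    out = np.full((height, width), np.nan)  # NaN marks unfilled pixels

    for r in range(height):
        # Shift every pixel of the row to its target column.
        target = cols + k * disparity[r]
        inside = (target >= 0) & (target < width)
        # On collisions the later pixel wins; no depth ordering is applied.
        out[r, target[inside]] = view[r, inside]

        # Fill the remaining holes by 1-D linear interpolation;
        # np.interp clamps to the nearest known value at the row ends.
        holes = np.isnan(out[r])
        if holes.any() and not holes.all():
            out[r, holes] = np.interp(cols[holes], cols[~holes], out[r, ~holes])
    return out

row = shift_and_fill([[10, 20, 30, 40]], [[0, 0, 1, 1]])
print(row)
```

In this toy row the last two pixels move one column to the right, which leaves a hole at column 2 that is filled from its two neighbors. A real codec would also have to handle occlusions and sub-pixel disparities, which this sketch ignores.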

This paper is organized as follows. The common view rendering methods are illustrated in Section 2. The proposed scalable coding method is described in Section 3. Experimental results are presented and analyzed in Section 4, while the concluding remarks are given in Section 5.

2 View image rendering

The light rays emanating from a 3D scene can be described by the complete seven-dimensional plenoptic function introduced by Adelson and Bergen [23]:

$$I = P(x, y, z, \theta, \phi, \lambda, t) \tag{1}$$

where (x, y, z) is the viewing position, (θ, ϕ) is the light ray direction, λ is the light ray wavelength, and t is the time. If we assume that the 3D scene is static and that color is represented by RGB channels, the plenoptic function can be reduced to five dimensions by dropping λ and t. Moreover, if the regions are free of occluders, the plenoptic function can be further simplified to four dimensions [24][25], which defines a light ray by the coordinates of its intersections with two parallel planes. This means that the 3D holoscopic image captures both spatial and angular information of a 3D scene. Therefore, VIs can be rendered from a 3D holoscopic image, where the VIs represent the orthographic projections of the captured 3D scene in different directions.
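The successive reductions described above can be written out as a chain. The symbols $P_5$, $L$, $(u, v)$, and $(p, q)$ are notation assumed here for illustration, following the common two-plane convention; they do not appear in the original text.

```latex
\begin{align*}
P(x, y, z, \theta, \phi, \lambda, t)
  &\;\xrightarrow{\ \text{static scene, RGB channels}\ }\; P_5(x, y, z, \theta, \phi) \\
  &\;\xrightarrow{\ \text{occluder-free region}\ }\; L(u, v, p, q)
\end{align*}
```

Here $(u, v)$ and $(p, q)$ are the points where a ray crosses the first and second parallel plane. The last step works because, in a region free of occluders, the radiance does not change along a ray, so one of the three positional coordinates carries no additional information.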

The simplest way to construct a single view image is to extract one pixel at the same relative position from each EI of a given 3D holoscopic image and then stitch the pixels together. However, extracting only one pixel from each EI results in very low resolution, and the rendered view image suffers from severe blocking artifacts. Another common rendering method is to construct a VI by extracting a patch from each EI [26]. The rendering process is shown in Fig. 2. Suppose that a P × P patch is extracted at the same relative position from each EI of size n × n. With N × N EIs in the 3D holoscopic image, the final rendered view image is of size (P · N) × (P · N). Extracting a patch from each EI improves the resolution of the rendered VI. However, with a fixed patch size, some artifacts are still likely to appear in parts of the rendered view [26].
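Both rendering methods above can be sketched with a single array reshape, since the one-pixel method is the patch method with P = 1. The code below is a minimal illustration, assuming the holoscopic image is stored as a regular grid of square n × n EIs with no gaps between them; the function name `render_view` and the `offset` parameter are hypothetical, and any per-patch flipping or blending that a real plenoptic renderer may need is left out.

```python
import numpy as np

def render_view(holo, n, P, offset):
    """Render one view image by taking a P x P patch from every EI.

    holo   : array of shape (N*n, N*n) or (N*n, N*n, C), a grid of n x n EIs
    n      : side length of one elemental image (EI) in pixels
    P      : side length of the extracted patch (P = 1 gives one pixel per EI)
    offset : (row, col) of the patch's top-left corner inside each EI
    """
    height, width = holo.shape[:2]
    if height % n or width % n:
        raise ValueError("image size must be a multiple of the EI size n")
    r0, c0 = offset
    if not (0 <= r0 <= n - P and 0 <= c0 <= n - P):
        raise ValueError("patch does not fit inside an EI")

    rows, cols = height // n, width // n  # number of EIs per column / row
    # Axes become (EI row, pixel row in EI, EI col, pixel col in EI, ...).
    eis = holo.reshape(rows, n, cols, n, *holo.shape[2:])
    patches = eis[:, r0:r0 + P, :, c0:c0 + P]
    # Stitch the patches back together in EI order.
    return patches.reshape(rows * P, cols * P, *holo.shape[2:])

# Toy example: 2 x 2 EIs of size 3 x 3, so N = 2 and n = 3.
holo = np.arange(36).reshape(6, 6)
single_pixel_view = render_view(holo, n=3, P=1, offset=(1, 1))  # shape (2, 2)
patch_view = render_view(holo, n=3, P=2, offset=(0, 0))         # shape (4, 4)
```

Changing `offset` selects a different viewing direction, and the output size (P · N) × (P · N) follows directly from the final reshape.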
