Published in IET Signal Processing
Received on 18th February 2011
Revised on 14th November 2011
doi: 10.1049/iet-spr.2011.0062
ISSN 1751-9675
Depth map compression and depth-aided view
rendering for a three-dimensional video system
F. Shao, M. Yu, G. Jiang, F. Li, Z. Peng
Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, People’s Republic of China
E-mail: shaofeng@nbu.edu.cn
Abstract: Three-dimensional (3D) video technologies are becoming increasingly popular because they provide a high-quality,
immersive experience to end users. In such systems, depth maps are used to generate virtual views by the depth-image-based
rendering technique. However, how to reduce the compression and rendering complexity for depth maps while maintaining high
rendering quality remains unresolved. In this study, a novel depth map compression and depth-aided view rendering method is
proposed. In the proposed method, depth maps are represented with different layers and compressed with different
macroblock-mode decision procedures, and several optimisation techniques, including spatio-temporal consistent warping,
colour correction and temporal consistent hole filling, are embedded into the view rendering framework. Experimental results
show that, compared with the traditional method, the proposed method reduces the computational complexity of compression by
more than 79% and that of rendering by more than 45%, while maintaining high rendering quality.
1 Introduction
Three-dimensional (3D) video has gained popularity with its
ability to give viewers an enhanced multimedia experience
in comparison to traditional two-dimensional (2D) video.
3D video (3DV) is expected to revolutionise visual media
by enabling 3D television (3DTV) and free viewpoint
television (FTV) applications [1, 2]. To promote these
applications, problems such as capturing, pre-processing,
coding and rendering of 3DV data have become very active
research topics.
To represent a 3D scene, different 3DV formats have been
proposed, among which the multi-view video plus depth (MVD)
format was recommended by MPEG of ISO/IEC and VCEG
of ITU-T because of its flexible representation and
compatibility with existing compression and
transmission technologies [3]. Since the MVD representation
requires a huge amount of data to be stored or transmitted to
the user, it is essential to develop efficient coding
techniques. Multi-view video coding (MVC) has been
widely researched [4]. Instead of directly applying the MVC
technique, some entirely different compression methods have
been proposed for depth maps. Morvan et al. [5] concentrated
on the smoothness of depth maps, and proposed a quadtree
decomposition scheme to model the smooth regions. Oh et al. [6]
proposed a depth boundary reconstruction filter and utilised
it as an in-loop filter to compress the depth map.
Furthermore, by considering the joint characteristics of the
MVD representation, some fast MVC methods have been
proposed that share the same macroblock (MB) mode or
motion vector information between colour videos and depth
maps [7, 8]. However, besides the compression efficiency,
the compression complexity and the effect of depth
distortion on view rendering quality are also important
issues to be solved in depth map compression.
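As an illustration of the general idea behind quadtree decomposition of a depth map, the following minimal sketch splits a square block recursively until each leaf is nearly flat. It is a generic range-based split, not the scheme of [5]; the function name quadtree_leaves, the flatness test (maximum minus minimum depth) and the threshold are assumptions made here for illustration only.

```python
from typing import List, Tuple

Leaf = Tuple[int, int, int]  # (x0, y0, size) of a square leaf block


def quadtree_leaves(depth: List[List[int]], x0: int, y0: int, size: int,
                    threshold: int, min_size: int = 1) -> List[Leaf]:
    """Recursively split a square block until it is flat enough.

    Hypothetical criterion: a block is 'smooth' when the range of its
    depth values does not exceed `threshold`.
    """
    values = [depth[y][x]
              for y in range(y0, y0 + size)
              for x in range(x0, x0 + size)]
    if max(values) - min(values) <= threshold or size <= min_size:
        return [(x0, y0, size)]
    half = size // 2
    leaves: List[Leaf] = []
    for dy in (0, half):          # top half, then bottom half
        for dx in (0, half):      # left half, then right half
            leaves += quadtree_leaves(depth, x0 + dx, y0 + dy, half,
                                      threshold, min_size)
    return leaves


# A 4x4 depth block that is flat except for one pixel.
example = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 90],
    [10, 10, 10, 10],
]
leaves = quadtree_leaves(example, 0, 0, 4, threshold=5)
```

In this example the three flat quadrants remain single 2x2 leaves, while the quadrant containing the outlier is split down to single pixels, so large smooth regions are described by few blocks.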
In the MVD representation, a virtual view can be rendered
from MVD data using the depth image-based rendering (DIBR)
technique [9]. Since accurate depth map acquisition is still
an unsolved problem, achieving high rendering quality with
DIBR is difficult because of the inaccurate depth information.
Many depth pre-processing and depth post-processing methods
have been proposed. Lai et al. [10] proposed iterative joint
multilateral filtering to process the estimated depth maps,
and aimed to align their edges with those in video frames
and to reduce false contours. Ekmekcioglu et al. [11]
proposed content adaptive filters for different depth map
regions to enforce consistency across the spatial, temporal
and inter-view dimensions of depth maps. Lee and Effendi
[12] proposed an adaptive edge-oriented smoothing filter to
deal with the problems of hole occurrences, geometric
distortions and computational complexity in DIBR.
However, even though these methods can mitigate the
influence of inaccurate depth information, how to improve
the rendering quality in DIBR remains an open
problem in 3DV research.
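To make the DIBR step referred to above concrete, the following minimal sketch warps one image row to a horizontally shifted virtual view. It is not the rendering method of this paper. It assumes rectified, parallel cameras, an 8-bit depth map in which 255 denotes the nearest plane and inverse depth is quantised linearly between z_near and z_far, and a disparity of focal * baseline / Z pixels. All function and parameter names are hypothetical.

```python
from typing import List, Optional


def depth_to_z(v: int, z_near: float, z_far: float) -> float:
    """Map an 8-bit depth value (255 = nearest) to metric depth Z.

    Assumes inverse depth is quantised linearly between 1/z_far and 1/z_near.
    """
    inv_z = (v / 255.0) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


def warp_row(colour: List[int], depth: List[int], focal: float,
             baseline: float, z_near: float, z_far: float
             ) -> List[Optional[int]]:
    """Forward-warp one row; None marks a hole left for later filling."""
    width = len(colour)
    out: List[Optional[int]] = [None] * width
    out_depth = [-1] * width  # depth value of the pixel currently stored
    for x in range(width):
        z = depth_to_z(depth[x], z_near, z_far)
        disparity = focal * baseline / z
        xt = int(round(x - disparity))  # shift direction is an assumption
        # Z-buffer test: a nearer pixel (larger depth value) wins.
        if 0 <= xt < width and depth[x] > out_depth[xt]:
            out[xt] = colour[x]
            out_depth[xt] = depth[x]
    return out


# One foreground pixel (depth 255) in front of a background (depth 0).
row = warp_row([10, 20, 30, 40, 50], [0, 0, 255, 0, 0],
               focal=2.0, baseline=1.0, z_near=1.0, z_far=2.0)
```

In this example the foreground pixel shifts by two samples and the background by one, so the foreground overwrites a background sample and two positions are left as holes. Inaccurate depth values shift pixels to wrong positions in exactly this step, which is why depth errors translate directly into rendering artefacts.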
From another perspective, many optimisation methods have
been proposed to improve the view rendering quality in
DIBR. Bulbul et al. [13] proposed a perceptually
based approach to improve the view rendering quality by
utilising the binocular suppression mechanism of the human
visual system. Do et al. [14] proposed a supersampling
technique to reduce warping errors and thereby obtain higher
view rendering quality. Zhao et al. [15] proposed
suppressing misalignment and enforcing alignment
between colour videos and depth maps to
reduce boundary artefacts. Tech et al. [16] evaluated the
IET Signal Process., 2012, Vol. 6, Iss. 3, pp. 247–254
© The Institution of Engineering and Technology 2012