978-1-4244-9305-0/11/$26.00 ©2011 IEEE 34
2011 4th International Congress on Image and Signal Processing
[Figure 1 panels: (a) Ballet_color_S7T4; (b) Ballet_color_S7T5; (c) FD in A, B; (d) Ballet_depth_S7T4; (e) Ballet_depth_S7T5; (f) FD in D, E]
A novel depth spatial-temporal consistency
enhancement algorithm for high compression
performance
Ruiqing Zhang¹,², Zongju Peng¹,², Mei Yu¹, Gangyi Jiang¹, Wei Bi¹,²
¹ Faculty of Information Science and Engineering, Ningbo University, Ningbo, China
{yumei, jianggangyi}@nbu.edu.cn
² Zhejiang Provincial Key Laboratory of Information Network Technology, Zhejiang University, Hangzhou, China
pengzongju@126.com
Abstract— In a free viewpoint video system based on multiview
video plus depth, inconsistencies in the depth video need to be
eliminated to ensure high-quality virtual view generation and
compression performance. The proposed preprocessing method
compensates for both spatial and temporal inaccuracy in the depth
information by using a Bayesian probability model and rival
penalized competitive learning in Self-Organizing Maps. First,
each gray value in the depth video is assigned to a specific class
by clustering. Then a gradient filter is applied for smoothing.
Experiments show that the proposed algorithm reduces the bit
rate by 7.97% to 46.83% while preserving the quality of the
generated virtual viewpoint.
Keywords-Depth video preprocessing; depth spatial-temporal
consistency enhancement; Bayesian Probability Model
I. INTRODUCTION
Free viewpoint video (FVV) is a new type of natural video
media that allows users to navigate freely through real-world
visual scenes. To represent 3D scenes, MPEG has specified
a standard for efficient compression and transmission [1,2].
In this context, the Multiview Video plus Depth (MVD) format
was proposed, consisting of multiple color videos with associated
depth data. Depth maps are used to project the images of a
limited number of viewpoints onto the images of other viewpoints
through depth image based rendering (DIBR) [3]. In an MVD-based
FVV system, the total bit rate produced by Multiview Video
Coding (MVC) is proportional to the number of views [4].
Thus, MVD carries far more data than monoview video.
Depth video is composed of depth maps, which have only a
luminance component. Because of its simpler texture, it needs only
10%-20% of the bit rate of normal color video without
decreasing PSNR [5]. However, disturbances such as optical noise
in depth maps reduce the accuracy of virtual view generation
and increase the bit rate. Therefore, for transmission
over a bandwidth-limited channel, depth redundancies need to be
exploited. Preprocessing of depth maps is required mainly to
ensure high-quality virtual view generation and to suppress
unnecessary details in both the temporal and spatial dimensions.
Kwanghee Jung proposed K-means clustering combined with a smoothing
filter [6], with k set manually to 5. However, because the complexity
of sequences varies, it is impractical to fix the number of classes as
a constant. For this reason, this paper presents a novel
depth spatial-temporal consistency enhancement algorithm. In
this algorithm, an improved Self-Organizing Map (SOM) is used
for adaptive depth map clustering. First, a Bayesian probability
model is used in the network competition, and rival penalized
competitive learning (RPCL) [7] is added to update the neural
weights. Subsequently, gradient smoothing is used to enhance
spatial-temporal consistency. Section II briefly introduces the
challenges in preprocessing and Self-Organizing Maps. Section III
presents the application of Bayesian probability and the overall
algorithm flow. Section IV describes the experiments conducted
and their results. Finally, conclusions are given in Section V.
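As a rough illustration of the rival penalized competitive learning idea mentioned above, the following is a minimal sketch of generic RPCL applied to scalar gray values. It is not the algorithm proposed in this paper: the Bayesian competition and the SOM neighborhood are omitted, and the function name `rpcl_cluster`, its parameters, and the learning rates are assumptions chosen only for illustration.

```python
import numpy as np

def rpcl_cluster(values, n_units=8, init_centers=None,
                 lr_win=0.05, lr_rival=0.002, epochs=20, seed=0):
    """Generic RPCL on 1-D gray values (illustrative sketch only).

    Returns the unit centers and the per-unit win counts.
    All parameter names and default values are hypothetical.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    if init_centers is None:
        # Start the units at randomly chosen samples.
        centers = rng.choice(values, size=n_units, replace=False).astype(float)
    else:
        centers = np.array(init_centers, dtype=float)
    wins = np.ones(centers.size)  # win counters for the conscience term

    for _ in range(epochs):
        for x in rng.permutation(values):
            # Frequency-weighted squared distance: frequent winners are handicapped.
            gamma = wins / wins.sum()
            dist = gamma * (x - centers) ** 2
            order = np.argsort(dist)
            winner, rival = order[0], order[1]
            # The winner moves toward the sample ...
            centers[winner] += lr_win * (x - centers[winner])
            # ... while the rival (second best) is pushed away from it.
            centers[rival] -= lr_rival * (x - centers[rival])
            wins[winner] += 1
    return centers, wins
```

The penalizing rate is kept much smaller than the winner's learning rate, which is the usual choice in RPCL. The intent is that more units than clusters can be supplied, so the class number does not have to be fixed by hand.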
II. PROBLEM DESCRIPTION
In the MVD format, depth maps are either captured directly by a depth
camera or estimated by a computer vision algorithm. The gray value
in a depth map denotes the distance from the camera image plane
to the 3D point. However, the gray value is not accurate, which
decreases the spatial and temporal correlation of the depth video.
Figure 1. Comparison of FD in color and depth sequence
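Assuming that FD in Figure 1 stands for the frame difference between two consecutive frames of the same view, it could be computed along the following lines. The function names `frame_difference` and `changed_ratio` and the `threshold` parameter are hypothetical and do not come from the paper.

```python
import numpy as np

def frame_difference(frame_a, frame_b):
    """Absolute per-pixel difference between two 8-bit frames."""
    # Widen to a signed type first so the subtraction cannot wrap around.
    a = np.asarray(frame_a, dtype=np.int16)
    b = np.asarray(frame_b, dtype=np.int16)
    return np.abs(a - b).astype(np.uint8)

def changed_ratio(frame_a, frame_b, threshold=0):
    """Fraction of pixels whose difference exceeds the threshold."""
    return float(np.mean(frame_difference(frame_a, frame_b) > threshold))
```

Under this assumption, a temporally consistent depth video would be expected to show a depth FD that is nonzero roughly where the color FD is nonzero.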