ORIGINAL RESEARCH PAPER
Adaptive mode decision for multiview video coding based
on macroblock position constraint model
Yue Li¹ · Gaobo Yang¹ · Yapei Zhu² · Can Liu² · Kai Liu³
Received: 8 April 2015 / Accepted: 15 August 2015
© Springer-Verlag Berlin Heidelberg 2015
Abstract Multiview video coding (MVC) relies on mode
decision, motion estimation and disparity estimation to
achieve a high compression ratio, at the cost of very high
computational complexity. This paper presents
an efficient mode decision approach for MVC using a
macroblock (MB) position constraint model (MPCM). The
proposed approach reduces the number of candidate modes
by utilizing the mode correlation and rate distortion cost
(RD cost) in the previously encoded frames/views.
Specifically, the mode correlations in both the
temporal-spatial domain and the inter-view domain are modeled with
MPCM. Then, MPCM is exploited to select the optimal
prediction direction for the MB currently being encoded. Finally,
the inter mode is determined early in the optimal prediction
direction. Experimental results show that the proposed
method saves 86.03 % of encoding time compared with
the exhaustive mode decision used in the Joint Multiview
Video Coding reference software, with only a 0.077 dB
loss in Bjontegaard delta peak signal-to-noise ratio
(BDPSNR) and a 2.29 % increase in the total Bjontegaard
delta bit rate (BDBR), which is superior to the performance
of state-of-the-art approaches.
Keywords Multiview video coding · Mode decision ·
Macroblock position constraint model · H.264/AVC
1 Introduction
Multi-view video refers to a set of temporally synchronized
videos of the same scene captured by multiple cameras
from different viewpoints [1]. Compared with single-view
video, multi-view video offers viewers more interactivity
and a more realistic experience, and it has great
potential in new video applications such as Free-viewpoint
Television (FTV) and Three-dimensional Television
(3DTV) [2, 3]. To facilitate research on multi-view
video coding (MVC), the Joint Video Team (JVT), which was
composed of experts from both ISO/IEC MPEG and the ITU-T
Video Coding Experts Group (VCEG) [4, 5], developed
the Joint Multiview Video Coding (JMVC) reference
software on the basis of the H.264/AVC video coding standard.
In JMVC, the hierarchical B picture (HBP) structure achieves
higher coding efficiency than the straightforward
solution of encoding each view independently with
H.264/AVC. Figure 1 shows the HBP architecture in JMVC,
where each arrow denotes the direction of reference.
The views are divided into two classes: even views and
odd views. The even views (V0, V2, V4 and V6) use
variable block-size motion estimation (ME) to
exploit the spatial-temporal correlation, while the odd
views (V1, V3, V5 and V7) adopt variable block-size
disparity estimation (DE), which exploits the
inter-view correlation to improve the coding efficiency.
Because ME and DE are performed separately and
repeatedly for each MB, the computational
complexity of mode decision is very high.
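The view classification above can be illustrated with a minimal sketch. It is not taken from JMVC: the function names, the list of four inter partition sizes, and the assumption that odd views run DE in addition to ME are all hypothetical simplifications, intended only to show why the per-MB search effort grows when both estimations are performed.

```python
# Illustrative only: names and the partition list are assumptions,
# not part of the JMVC reference software.
INTER_MODES = ("16x16", "16x8", "8x16", "8x8")  # assumed candidate partitions


def estimation_tools(view):
    """Return the estimation tools assumed for a view index.

    Even views (V0, V2, ...) use motion estimation (ME) only;
    odd views (V1, V3, ...) are assumed to run disparity
    estimation (DE) in addition to ME.
    """
    return ("ME",) if view % 2 == 0 else ("ME", "DE")


def searches_per_mb(view):
    """Count block searches per MB under exhaustive mode decision:
    one search per candidate partition and per estimation tool."""
    return len(INTER_MODES) * len(estimation_tools(view))


for v in range(8):
    tools = "+".join(estimation_tools(v))
    print(f"V{v}: {tools}, {searches_per_mb(v)} block searches per MB")
```

Under these assumptions, an odd view needs twice as many block searches per MB as an even view, which is the cost that fast mode decision methods try to avoid.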
To reduce the computational complexity, several fast
mode decision approaches have been proposed in the literature.
They can be categorized into two classes. The first class
terminates the SKIP/DIRECT mode decision process
early [6–8]. If the SKIP/DIRECT mode is considered as the
✉ Gaobo Yang
yanggaobo@hnu.edu.cn
¹ School of Information Science and Engineering, Hunan
University, Changsha 410082, China
² Faculty of Physics and Electronic Information Science,
Hengyang Normal University, Hengyang 421002, China
³ North University of China, Taiyuan 030051, China
J Real-Time Image Proc
DOI 10.1007/s11554-015-0527-1