336 CHINESE OPTICS LETTERS / Vol. 5, No. 6 / June 10, 2007
Scalable distributed video coding based on
block SW-SPIHT
Anhong Wang¹,², Yao Zhao¹, Zhenfeng Zhu¹, and Hao Wang¹
¹ Institute of Information Science, Beijing Jiaotong University, Beijing 100044
² Taiyuan University of Science and Technology, Taiyuan 030024
Received September 5, 2006
Distributed source coding (DSC) and distributed video coding (DVC) have been receiving increasing
attention because they permit low-complexity encoding. At the same time, as new requirements
emerge in network communication, the scalability of the bit stream has become a focus in practical
applications. We present a scalable DVC scheme that does not require layered coding: the main
attributes of DVC, namely easy encoding and robustness, are retained, and scalability is integrated
as well. The Wyner-Ziv frames are encoded with block Slepian-Wolf set partitioning in hierarchical
trees (SW-SPIHT) to obtain the scalable bit stream. In addition, binary motion searching is
performed at the decoder with the help of a rate-variable ‘hash’ from the encoder to improve the
performance of the whole system. Experimental results show that our system achieves a higher peak
signal-to-noise ratio (PSNR) than pixel-domain DVC at high bit rates, and that scalability in
signal-to-noise ratio (SNR) is also achieved.
OCIS codes: 100.2000, 040.7290, 330.7310, 100.7410.
Easy encoding is now required by uplink-friendly multimedia services. Conventional MPEG
and H.26x coders cannot meet this need because of the complex
motion estimation at the encoder. Based on the Slepian-Wolf[1]
and Wyner-Ziv[2] theories, which laid a solid
foundation for easy encoding, distributed source coding
(DSC) and distributed video coding (DVC) have shown
great potential, achieving almost the same coding
performance by exploiting the dependences between sources
at the decoder. Many related works have since
been put forward. In Ref. [3], a syndrome-based PRISM
scheme was proposed. A similar scheme by Aaron
and Girod is described in Ref. [4]. Building on
Refs. [3,4], some improvements were made in
Refs. [5,6]. However, these
strategies demonstrate DVC’s efficiency only in terms of easy
encoding and robustness, without considering the
scalability of the bit stream.
In fact, the scalability of the bit stream is
crucial in many real applications;
for example, a set of heterogeneous mobile receivers
may have different computational and display capabilities
and/or channel capacities. However, only a few tentative
schemes have been proposed for scalable DVC, such as
Refs. [7-9]. These schemes are all built on a
layered video framework in which a standard video
coding scheme serves as the base layer. In particular,
encoding that is not completely intra-frame, with motion
estimation at the base layer, is still adopted, which
inevitably compromises
easy encoding at the encoder. In addition,
the base layer is clearly fragile over lossy channels
because of the prediction drift in
motion compensation.
In this paper, we give further consideration to
scalable DVC and try to preserve the properties of easy
encoding and robustness. A completely intra-frame
encoding model based on block Slepian-Wolf set
partitioning in hierarchical trees (SW-SPIHT) is proposed
for Wyner-Ziv frames. Like SPIHT, block
SW-SPIHT produces an embedded bit stream, which
can be truncated at rates more flexibly
than in layered coding. Inspired
by Ref. [10], which successfully applied SW-SPIHT to distributed
hyperspectral imagery and showed better
performance than intra-frame SPIHT, we extend the idea of
SW-SPIHT to wavelet blocks and develop a block
SW-SPIHT technique. Additionally, binary motion
searching (BMS) at the decoder with a rate-adaptive ‘hash’ is
proposed for block SW-SPIHT to improve the performance
of the whole system. The rate-adaptive ‘hash’ in our case
is based on some parity bits from a rate-compatible
channel code, which differs from the fixed-rate ‘hash’ in
Ref. [11]. Moreover, the completely intra-frame encoding
provides robustness. Note
that the ‘hash’ here refers to a representation of
encoding-related information. Once sent to the decoder, the
auxiliary information contained in the ‘hash’ is expected
to be of great help for motion searching at the
decoder.
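The flexible truncation of an embedded bit stream mentioned above can be illustrated with a minimal sketch. It uses plain bit-plane coding as a stand-in, not SPIHT or SW-SPIHT, and the function names and sample coefficients are hypothetical. It only shows that decoding any prefix of the stream yields a coarser but valid reconstruction whose error shrinks as more planes arrive.

```python
# Illustrative only: plain bit-plane coding, NOT the paper's block SW-SPIHT.
# All names and the sample coefficients below are assumptions.

def encode_bitplanes(coeffs, num_planes):
    """Emit sign bits, then magnitude bits from the most significant plane down."""
    signs = [1 if c < 0 else 0 for c in coeffs]
    planes = []
    for p in range(num_planes - 1, -1, -1):
        planes.append([(abs(c) >> p) & 1 for c in coeffs])
    return signs, planes

def decode_bitplanes(signs, planes, num_planes):
    """Rebuild coefficients from however many planes were received."""
    mags = [0] * len(signs)
    for k, plane in enumerate(planes):
        p = num_planes - 1 - k  # bit position carried by the k-th received plane
        for i, bit in enumerate(plane):
            mags[i] |= bit << p
    return [-m if s else m for m, s in zip(mags, signs)]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

coeffs = [57, -34, 12, -5, 3, 0, -1, 20]   # magnitudes fit in 6 bits
signs, planes = encode_bitplanes(coeffs, 6)
# Truncate the stream after k planes, k = 0..6, and measure the distortion.
errors = [mse(coeffs, decode_bitplanes(signs, planes[:k], 6)) for k in range(7)]
```

Each additional plane can only reduce the distortion, which is the SNR-scalable behaviour an embedded stream is meant to provide; a layered scheme would instead offer only a few fixed operating points.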
The proposed scalable DVC for the Wyner-Ziv frame is
shown in Fig. 1, in which the even frame X_{2i} is the
Wyner-Ziv frame and the odd frames X_{2i−1} and X_{2i+1}
act as the key frames. The key frames can be coded with
conventional SPIHT, while the Wyner-Ziv frame
is coded in the following steps.
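As a small sketch of the frame roles just described, the snippet below assigns odd-numbered frames as key frames and even-numbered frames as Wyner-Ziv frames. The 1-based numbering and the function names are assumptions made for illustration.

```python
# Hypothetical helpers; frames are assumed to be numbered 1..num_frames.

def split_frames(num_frames):
    """Return (key_indices, wz_indices): odd frames are key frames, even are Wyner-Ziv."""
    key = [n for n in range(1, num_frames + 1) if n % 2 == 1]
    wz = [n for n in range(1, num_frames + 1) if n % 2 == 0]
    return key, wz

def adjacent_key_frames(wz_index):
    """Indices of the key frames X_{2i-1} and X_{2i+1} around Wyner-Ziv frame X_{2i}."""
    if wz_index % 2 != 0:
        raise ValueError("Wyner-Ziv frames have even indices")
    return wz_index - 1, wz_index + 1

key, wz = split_frames(7)
```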
1) Intraframe encoding.
Step 1. Generating the wavelet blocks (WBs). The
‘Generating WBs’ module rearranges the
wavelet coefficients to form cross-scale wavelet blocks
(WBs), as shown in Fig. 2. That is to say, a 3-scale (it
can easily be extended to multi-scale) discrete wavelet
1671-7694/2007/060336-04 © 2007 Chinese Optics Letters