Li et al. / J Zhejiang Univ-Sci C (Comput & Electron) 2012 13(8):624-634
The QIM steganography hides the secret data during the VQ process: each time a codeword is selected, one secret bit can be embedded. Taking the embedding process based on $L_1$ as an example, the CNV-QIM steganography (Xiao et al., 2008) first partitions $L_1$ into two sub-codebooks $L_1^1$ and $L_1^2$ using the CNV algorithm, where $L_1^1$ and $L_1^2$ both contain $|L_1|/2$ vector indices and satisfy
$$L_1^1 \cap L_1^2 = \varnothing, \quad L_1^1 \cup L_1^2 = L_1. \qquad (2)$$
The CNV algorithm guarantees that each codeword and its nearest codeword in $L_1$ belong to different sub-codebooks. Thus, the additional signal distortion caused by QIM embedding is minimal in comparison with other division methods. Upon completion of the partition, the labels '0' and '1' are assigned to $L_1^1$ and $L_1^2$, respectively. When a secret bit is embedded, only the sub-codebook carrying the corresponding label is used for codeword selection. On the decoding side, the hidden bit is extracted by checking which sub-codebook the codeword belongs to.
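The embedding and extraction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CNV algorithm of Xiao et al. (2008): the hypothetical helper `partition_codebook` simply two-colours the nearest-neighbour graph, which reproduces the property that a codeword and its nearest codeword receive different labels but does not enforce the equal sub-codebook sizes of Eq. (2). The random 256-entry, 3-dimensional codebook and the plain Euclidean distance are also assumptions.

```python
import numpy as np
from collections import deque

def partition_codebook(codebook):
    """Label codewords 0/1 so that each codeword and its nearest one differ.

    Assumption: a plain two-colouring of the nearest-neighbour graph,
    used here as a stand-in for the CNV partition.
    """
    n = len(codebook)
    diff = codebook[:, None, :] - codebook[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # squared distances
    np.fill_diagonal(dist, np.inf)               # ignore self-distance
    nearest = dist.argmin(axis=1)

    # Undirected graph with one edge per (codeword, nearest codeword) pair.
    adj = [[] for _ in range(n)]
    for i, j in enumerate(nearest):
        adj[i].append(int(j))
        adj[int(j)].append(i)

    labels = np.full(n, -1)
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = 0
        queue = deque([start])
        while queue:  # breadth-first two-colouring of one component
            u = queue.popleft()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = 1 - labels[u]
                    queue.append(v)
    return labels, nearest

def qim_quantize(vector, codebook, labels, bit):
    """Nearest codeword index, searched only in the sub-codebook labelled `bit`."""
    candidates = np.flatnonzero(labels == bit)
    d = np.sum((codebook[candidates] - vector) ** 2, axis=1)
    return int(candidates[d.argmin()])

def qim_extract(index, labels):
    """The hidden bit is the label of the sub-codebook containing `index`."""
    return int(labels[index])

rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 3))   # hypothetical stand-in for L_1
labels, nearest = partition_codebook(codebook)
bits = rng.integers(0, 2, size=100)    # secret bits, one per frame
vectors = rng.normal(size=(100, 3))    # hypothetical split vectors
indices = [qim_quantize(v, codebook, labels, b) for v, b in zip(vectors, bits)]
recovered = [qim_extract(i, labels) for i in indices]
```

Because extraction only looks up the label of the received index, the decoder needs the same partition as the encoder but no access to the original split vectors.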
According to the above analysis, an encoded speech bit stream containing $N$ G.723.1 frames carries three split vector sequences. Each split vector sequence $F_i$ can be represented as follows:
$$F_i = \{f_{i,1}, \ldots, f_{i,k}, \ldots, f_{i,N}\}, \quad i = 1, 2, 3, \qquad (3)$$
where $f_{i,k}$ ($i \in \{1, 2, 3\}$, $k \in [1, N]$) represents the $i$th split vector of frame $k$ in the bit stream. After VQ, $F_i$ will be converted to the quantization index sequence (QIS) $S_i$ as
$$S_i = \{c_{i,1}^{h}, \ldots, c_{i,k}^{u}, \ldots, c_{i,N}^{m}\}, \quad i = 1, 2, 3, \qquad (4)$$
where $c_{i,k}^{u}$ ($i \in \{1, 2, 3\}$, $k \in [1, N]$, $u \in [1, |L_i|]$) is the quantization index of $f_{i,k}$.
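The mapping from a split vector sequence $F_i$ to its QIS $S_i$ in Eqs. (3) and (4) can be illustrated with a short sketch. The codebooks and split vectors below are random placeholders, the split dimensions (3, 3, 4) and the codebook size of 256 are assumptions, and the nearest-codeword search uses a plain squared Euclidean distance rather than the distance measure of an actual G.723.1 encoder. Indices are 0-based here, whereas the text counts $u$ from 1.

```python
import numpy as np

def quantize(vector, codebook):
    """Index of the codeword closest to `vector` (squared Euclidean distance)."""
    return int(np.sum((codebook - vector) ** 2, axis=1).argmin())

def build_qis(split_vectors, codebook):
    """Map a split vector sequence F_i (N x dim) to its index sequence S_i."""
    return [quantize(f, codebook) for f in split_vectors]

rng = np.random.default_rng(1)
N = 100                # number of frames
dims = (3, 3, 4)       # assumed dimensions of the three split vectors

# Hypothetical codebooks L_1, L_2, L_3 and split vector sequences F_1, F_2, F_3.
codebooks = [rng.normal(size=(256, d)) for d in dims]
F = [rng.normal(size=(N, d)) for d in dims]

# One QIS per split vector: S[i][k] is the index chosen for frame k.
S = [build_qis(F_i, L_i) for F_i, L_i in zip(F, codebooks)]
```

Each frame contributes exactly one index to each of the three sequences, so every $S_i$ has length $N$.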
The QIM steganography (Xiao et al., 2008) embeds the secret bits into the bit stream when the quantization index of $f_{i,k}$ is chosen. Because each frame contains three split vectors, three secret bits can be hidden in each frame. The QIM steganography inevitably changes the original quantization result, because it may convert the original quantization index $c_{i,k}^{h}$ of $f_{i,k}$ into $c_{i,k}^{u}$ ($u \neq h$). Therefore, the original QIS $S_i$ of $F_i$ will be disturbed. Fig. 1 presents an example of the QIS disturbance. In this example, we first encode a speech segment with a duration of 3 s according to G.723.1 and obtain the 'cover' object. Second, we repeat the encoding process using the QIM steganography (Xiao et al., 2008) to obtain the 'stego' object. We extract the QIS $S_1 = \{c_{1,1}^{h}, \ldots, c_{1,50}^{u}, \ldots, c_{1,100}^{m}\}$ from the encoded bit streams of the 'cover' and 'stego' objects. These two QISs are shown in Fig. 1, where the difference between the original QIS and its steganographic version is clearly visible; the QIM steganography significantly changes the quantization vector sequence. This disturbance of the QIS is likely to change the distribution characteristics of the quantization index as well. If these characteristics can be quantified, then the disturbance in the QIS can be measured. Taking advantage of this information, we can detect QIM steganography in a G.723.1 bit stream.
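As a rough illustration of what measuring the QIS disturbance could look like, the sketch below compares a cover QIS with a stego QIS in two simple ways: the fraction of positions whose index changed, and the L1 distance between the two index histograms. Both measures, the function names, and the toy sequences are assumptions chosen for illustration; they are not the statistical models developed in Section 3.

```python
import numpy as np

def changed_fraction(cover_qis, stego_qis):
    """Fraction of positions where the stego index differs from the cover index."""
    cover = np.asarray(cover_qis)
    stego = np.asarray(stego_qis)
    return float(np.mean(cover != stego))

def histogram_distance(cover_qis, stego_qis, codebook_size):
    """L1 distance between the normalised index histograms of two QISs."""
    h_cover = np.bincount(cover_qis, minlength=codebook_size) / len(cover_qis)
    h_stego = np.bincount(stego_qis, minlength=codebook_size) / len(stego_qis)
    return float(np.abs(h_cover - h_stego).sum())

# Toy sequences (hypothetical values, 0-based indices into a 256-entry codebook).
cover = [12, 200, 37, 37, 90, 145]
stego = [12, 201, 37, 40, 90, 150]

frac = changed_fraction(cover, stego)
dist = histogram_distance(cover, stego, codebook_size=256)
```

The first measure needs both sequences and is therefore only useful for analysis; a detector that sees a single bit stream would have to rely on distribution-level quantities such as the second one.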
3 Statistical models of quantization index distribution characteristics
According to the acoustics of speech production, the phoneme is the basic unit of human speech and is the pronunciation of one or several sequential letters
(Thomas, 2002). When a person speaks, he/she con-
tinuously adjusts his/her articulators for a sequence of
Fig. 1 Example of quantization index modulation (QIM) steganography disturbing the quantization index sequence (QIS) of the first split vector (x-axis: sequence number, 0 to 100; y-axis: quantization index, 0 to 250; curves: Cover, Stego)