1556-6013 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIFS.2018.2871748, IEEE
Transactions on Information Forensics and Security
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. *, NO. *, SEPTEMBER 2018 1
Patchwork-based Audio Watermarking Robust
Against De-synchronization and Recapturing
Attacks
Zhenghui Liu, Member, IEEE, Yuankun Huang, Member, IEEE, and Jiwu Huang, Fellow, IEEE
Abstract—Watermarking is a solution for copyright protection
and forensics tracking, but recapturing and de-synchronization
attacks may be used to effectively remove audio watermarks.
Although much effort has been made in recent years, the
robustness of audio watermarking against recapturing and
de-synchronization attacks remains a challenging issue. To address it,
we first construct the frequency-domain coefficients logarithmic
mean (FDLM) feature of digital audio. By theoretical analysis,
we conclude that the residual between the FDLM features of two
groups is robust against recapturing attacks. We then propose
a robust audio watermarking method based on this feature
using the patchwork framework. Compared with the existing method
that is most robust against recapturing attacks, our method
reduces the bit error rate (BER) by 7%.
In addition, the proposed method notably outperforms
state-of-the-art patchwork-based watermarking methods when
recapturing is followed by signal processing operations
and de-synchronization attacks.
Index Terms—Audio watermarking, frequency-domain logarithmic
mean, recapturing attack, de-synchronization attacks.
I. INTRODUCTION
WATERMARKING has been a solution for copyright
protection, content authentication and ownership
verification, and it has been extensively researched [1,2]. One
of the most important applications of audio watermarking
is to prevent illicit copying and dissemination of copyrighted
audio. It can provide evidence of copyright infringement after
a violation has occurred. Moreover, it can be
used to trace the fingerprint of pirates and to resolve rightful
ownership. With the proliferation of smartphones,
tablet computers, and voice recorders, ordinary users can easily
recapture copyrighted audio. Watermark information may be
removed or extracted incompletely after recapturing. Thus,
recapturing can disable the extraction of copyright data from a
watermarked audio track. Therefore, the dissemination
of unauthorized audio presents a great challenge to audio
watermarking.
Manuscript received December 12, 2017; revised May 3, 2018; accepted
September 12, 2018. This work was supported by NSFC (U1636202,
61502409), Shenzhen R&D Program (JCYJ20160328144421330), and also
the Alibaba Group through Alibaba Innovative Research Program.
(Corresponding author: Jiwu Huang.)
Z. Liu, Y. Huang, and J. Huang are with the Guangdong Key Laboratory
of Intelligent Information Processing and Shenzhen Key Laboratory
of Media Security, also National Engineering Laboratory for Big Data
System Computing Technology, College of Information Engineering,
Shenzhen University, Shenzhen 518060, China (e-mail: zhenghui.liu@163.com,
huangyuankun2016@email.szu.edu.cn, jwhuang@szu.edu.cn).
Z. Liu is also with College of Computer and Information Technology,
Xinyang Normal University, Xinyang 464000, China.
Existing audio watermarking methods can be generally
categorized into time-domain and transform-domain methods.
Time-domain methods include time-aligned [3-5] and
echo-based [6-8] methods, while transform-domain methods
include spread spectrum [9,10], quantization index modulation
[11], and patchwork [12] methods.
Patchwork [12] is a promising watermarking algorithm that
has shown high robustness against many common attacks, such
as noise addition, filtering, compression, re-quantization, and
re-sampling. The patchwork method shifts the host
statistics in two main steps: (1) choose two patches
pseudo-randomly; (2) add a small constant to the samples of one patch
and subtract the same constant from the samples of the other
patch. Detection starts by subtracting
the sample values of one patch from those of the other. Yeo et al. [13]
improved the power and robustness of the original patchwork
algorithm. They presented a Modified Patchwork Algorithm
(MPA) for audio watermarking and applied it not
only in the Discrete Cosine Transform (DCT) domain but also
in the Discrete Fourier Transform (DFT) and Discrete Wavelet
Transform (DWT) domains. The algorithm is robust
against common signal processing operations.
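The two embedding steps and the subtraction-based detection described above can be illustrated with a minimal Python sketch. It is not the implementation of [12] or [13]: it works directly on a list of sample values rather than on transform coefficients, and the function names (`choose_patches`, `embed`, `detect_statistic`, `is_watermarked`), the parameters `key`, `patch_size`, and `delta`, and the decision threshold at `delta` are all assumptions made for illustration.

```python
import random


def choose_patches(n_samples, patch_size, key):
    """Step 1: pick two disjoint index sets pseudo-randomly from a secret key."""
    rng = random.Random(key)
    idx = rng.sample(range(n_samples), 2 * patch_size)
    return idx[:patch_size], idx[patch_size:]


def embed(host, key, patch_size, delta):
    """Step 2: add delta to every sample of patch A, subtract it from patch B."""
    patch_a, patch_b = choose_patches(len(host), patch_size, key)
    marked = list(host)
    for i in patch_a:
        marked[i] += delta
    for i in patch_b:
        marked[i] -= delta
    return marked


def detect_statistic(signal, key, patch_size):
    """Difference of the two patch means.

    Embedding raises this value by exactly 2 * delta; for an unmarked
    signal whose patches share the same statistics it stays near zero.
    """
    patch_a, patch_b = choose_patches(len(signal), patch_size, key)
    mean_a = sum(signal[i] for i in patch_a) / patch_size
    mean_b = sum(signal[i] for i in patch_b) / patch_size
    return mean_a - mean_b


def is_watermarked(signal, key, patch_size, delta):
    """Assumed decision rule: threshold halfway between 0 and 2 * delta."""
    return detect_statistic(signal, key, patch_size) > delta


if __name__ == "__main__":
    noise = random.Random(0)
    host = [noise.gauss(0.0, 1.0) for _ in range(20000)]  # synthetic host
    marked = embed(host, key=1234, patch_size=5000, delta=0.1)
    print("unmarked statistic:", detect_statistic(host, 1234, 5000))
    print("marked statistic:  ", detect_statistic(marked, 1234, 5000))
```

The detector only separates the two cases reliably when both patches have similar means before embedding, which is why larger patches and a host with matching patch statistics help in this sketch.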
Kang et al. [14] presented a technique referred to as full
index embedding, which embeds the watermark by modifying
the samples in all index sets to reduce the probability of
misdetection. In that paper, the embedding strength is adapted
frame by frame using psychoacoustic models. The
performance of this method relies on the assumption that
the chosen patches have the same statistical properties. This
assumption restricts the application of the method because it
does not always hold in practice. To solve the problem, Kalantari et
al. [15] proposed a multiplicative patchwork method that uses
the wavelet transform coefficients of the two patches produced
from each host audio segment. In this method, only segments whose
two patches have comparable statistical characteristics
are used to embed the watermark. In fact, a sizeable percentage
of audio segments do not satisfy this constraint, so only some
segments are used for watermark embedding. Since no information
is available regarding which segments were selected
for watermarking, a considerable number of false watermarks
are extracted, and the method does not provide an
effective way to find these segments.
Natgunanathan et al. [16] proposed a patchwork-based
watermarking scheme for audio signals. In this method, the host
audio segment is divided into two sub-segments and the DCT
coefficients of the sub-segments are calculated. The authors