SPEECH WATERMARKING BASED ON ROBUST PRINCIPAL COMPONENT ANALYSIS
AND FORMANT MANIPULATIONS
Shengbei WANG, Weitao YUAN, Jianming WANG∗
School of Computer Science & Software Engineering
Tianjin Polytechnic University
Binshuixi Road, Xiqing District, Tianjin, China

Masashi UNOKI†
School of Information Science
Japan Advanced Institute of Science and Technology
1-1 Asahidai, Nomi, Ishikawa, Japan
ABSTRACT
This paper proposes a watermarking method for speech signals based on Robust Principal Component Analysis (RPCA) and formant manipulations. As the spectrogram of speech has a relatively sparse structure, the core information of speech is extracted into a sparse matrix with RPCA so that formants can be estimated more accurately with Linear Prediction (LP), even under noise and interference, which significantly improves the robustness of the proposed method. We investigate how the formants can be controlled and manipulated to make the watermarking method effective. Watermarks are embedded into speech by controlling the shape and power of formants through a stable and robust parameterization, line spectral frequencies (LSFs). Evaluations of inaudibility and robustness show that the proposed method not only satisfies inaudibility but also provides robustness against general signal processing and different speech codecs that is better than that of other methods.
Index Terms— Robust principal component analysis, Linear
prediction, Formant, Line spectral frequencies, Robustness
1. INTRODUCTION
Speech signals are an important information carrier in many social applications such as WeChat and GoogleTalk. However, modern digital technologies have put the security of speech at risk. Watermarking is a promising solution for protecting speech signals. A general watermarking method should be inaudible to human perception, blind for watermark extraction, and robust against signal processing and codecs. However, there is a trade-off among these competing requirements, e.g., robustness is usually improved at the expense of inaudibility, and vice versa. Realizing watermarking with all of the desired properties therefore remains a challenging problem. This work focuses on exploring inaudible, blind, and robust speech watermarking.
There has been significant research into speech watermarking in recent years. A typical category of watermarking exploits the characteristics of the human auditory system (HAS) to achieve inaudibility [1, 2]. For instance, watermarks can be embedded into the phase of speech based on the fact that the HAS is not sensitive to slight phase modifications [3, 4]. Methods based on quantization index modulation (QIM) [5, 6] form another category, in which much effort has been devoted to selecting suitable features to balance inaudibility and robustness. Spread spectrum is a well-known technique widely employed for robust watermarking [7, 8, 9]. Aside from these categories, hybrid watermarking [10, 11, 12] has been verified to have superior robustness, since watermarks are doubly embedded, which enables them to be reliably extracted. Despite these achievements, many existing methods cannot reach a balance between inaudibility and robustness. In particular, robustness against codecs is highly desired for speech watermarking, yet many methods are not completely robust against different speech codecs.
∗Thanks to grant No. 2017KJ089, the Natural Science Foundation of Tianjin (No. 17JCQNJC00100 and No. 16JCYBJC41500), and the National Natural Science Foundation of China (No. 6137104 and No. 61602344) for funding.
†This work was also supported by a Grant-in-Aid for Scientific Research (B) (No. 17H01761) and the I-D DATA foundation.
A common problem in the watermarking field is that many methods can extract watermarks in ideal situations (without noise or interference), but when the watermarked signal is contaminated by noise or interference, extraction of the embedded watermarks fails, which leads to weak robustness. We previously proposed two formant-enhancement based watermarking methods [13, 14]. However, their robustness against speech codecs was not satisfactory, e.g., [13] was not robust against any speech codec and [14] was not robust against G.729 at high capacities. This paper proposes a speech watermarking method based on robust principal component analysis (RPCA) and formant manipulations. RPCA is employed to extract the core information of speech so that formants can be estimated correctly even under interference caused by speech processing and codecs. Watermarks are embedded into formants of relatively low power by controlling line spectral frequencies (LSFs) to maintain speech quality. The main contribution of this paper is that RPCA is introduced to watermarking for the first time; its introduction significantly attenuates the influence of various kinds of interference on the watermark extraction process, which improves robustness. The effectiveness of the proposed method is demonstrated in the experiments.
2. PROPOSED METHOD
Linear Prediction (LP) is widely used to separate the vocal tract and excitation information in the source-filter model of speech production. The coefficients derived from LP analysis provide important information about a key acoustic feature, i.e., formants. Nevertheless, when speech is smeared by interference such as background noise and reverberation, the estimated LP envelope and formants can be severely distorted. As the proposed method embeds watermarks into formants, it is necessary to ensure that formants can be correctly estimated even under interference.
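As a concrete illustration of this step (a generic textbook sketch, not the authors' exact configuration), LP coefficients can be obtained with the autocorrelation method and the Levinson-Durbin recursion, and formant candidates read off from the angles of the complex roots of the prediction polynomial A(z):

```python
import numpy as np

def lp_coefficients(x, order):
    """LP coefficients [1, a1, ..., ap] via autocorrelation + Levinson-Durbin."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
    a = np.array([1.0])
    err = r[0]                                        # prediction error power
    for i in range(1, order + 1):
        acc = r[i] + a[1:] @ r[i - 1:0:-1]
        k = -acc / err                                # reflection coefficient
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]
        err *= 1.0 - k * k
    return a

def formant_frequencies(a, fs):
    """Formant candidates from roots of A(z) in the upper half plane (Hz)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 1e-6]              # keep one of each pair
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))
```

For real speech one would additionally pre-emphasize, window each analysis frame, and pick an LP order of roughly fs/1000 + 2; those details are omitted here for brevity.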
In general, speech varies significantly and continuously over time and its power concentrates around the formants, so the spectrogram of speech has a relatively sparse structure. Based on this fact, some
978-1-5386-4658-8/18/$31.00 ©2018 IEEE ICASSP 2018