Cepstral Techniques: A Speech Enhancement Method for Adverse Environments


"这篇研究论文提出了一种基于倒谱的语音增强预处理和后处理算法，旨在改善在恶劣环境中的语音清晰度。该方法通过减少语音对噪声功率谱密度估计的影响来避免过估计噪声，并能抑制非平稳噪声和音乐噪声，同时避免引入可闻的语音失真。" 在语音处理领域，尤其是在噪音环境中进行语音增强是一个关键问题。这篇论文"基于倒谱的预处理和后处理用于恶劣环境中的语音增强"提出了一种创新的解决方案。首先，让我们深入了解倒谱分析（Cepstrum Analysis）这一核心技术。倒谱分析是一种信号处理技术，它通过对频谱的对数取傅立叶逆变换来获取信号的倒谱表示。在语音处理中，倒谱可以揭示语音的基本结构，特别是谐波特性，这使得它在噪声抑制和语音识别等方面具有优势。论文中提到的预处理步骤专注于降低语音对噪声估计的影响。在单通道语音增强中，准确估计噪声功率谱密度（NPSD）是关键，因为过度估计噪声可能会导致语音的损失。通过倒谱预处理，论文提出的方法能够消除语音中的谐波成分，从而更准确地跟踪非平稳噪声。这种方法有助于防止在噪声估计过程中误将语音当作噪声处理。接下来，后处理阶段的目的是进一步净化增强后的语音信号。论文采用倒谱后处理策略，可以有效地抑制那些依然存在的非平稳噪声成分以及恼人的音乐噪声（musical noise）。音乐噪声是指在噪声抑制过程中产生的不自然的、类似音符的噪声，通常在低信噪比环境下尤为明显。通过精细的倒谱后处理，算法能在不引入可听的语音失真的情况下减少这些噪声，从而提高语音的质量和可理解性。实验结果证明了所提算法的有效性，表明它能够在不利的环境中显著提升语音的清晰度。这种基于倒谱的方法为语音增强提供了新的思路，尤其是在应对不断变化的噪声环境时，能够提供更稳定的性能。这篇研究论文对语音处理社区贡献了一种新的、基于倒谱的预处理和后处理技术，这将有助于在各种嘈杂环境下提升语音通信的质量。这一方法对于开发更智能的语音识别系统、助听设备以及语音通信应用等领域具有重要的理论与实践意义。

Technical Note

A cepstrum-based preprocessing and postprocessing for speech enhancement in adverse environments

Xiaohu Hu, Shiwei Wang, Chengshi Zheng ⇑, Xiaodong Li

Communication Acoustics Laboratory, Institute of Acoustics, Chinese Academy of Sciences, 100190 Beijing, China

article info

Article history: Received 10 April 2013; received in revised form 23 May 2013; accepted 5 June 2013

Keywords: Cepstral analysis; Speech enhancement; Noise estimation

abstract

This paper proposes a cepstrum-based preprocessing and postprocessing algorithm for single-channel speech enhancement. The cepstrum-based preprocessing scheme is applied to reduce the impact of the voiced speech on estimating the noise power spectral density (NPSD). It avoids overestimating the NPSD by eliminating the harmonic components of the voiced speech when tracking non-stationary noise components. The cepstrum-based postprocessing scheme is used to suppress both some non-stationary noise components and the annoying musical noise without introducing audible speech distortion. Experimental results show that the proposed algorithm can track non-stationary noise effectively without overestimating the NPSD. Moreover, the proposed algorithm achieves better performance in terms of both the segmental signal-to-noise-ratio improvement and the PESQ improvement.
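For readers unfamiliar with the first of the two metrics named in the abstract, a segmental SNR can be sketched as follows. The evaluation settings used in the paper are not part of this excerpt, so the frame length of 256 samples and the clamping range of -10 to 35 dB are common conventions assumed here, and `segmental_snr` is an illustrative name. The reported improvement would be the difference between this value for the enhanced signal and for the noisy input.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi] (assumed limits)."""
    n_frames = len(clean) // frame_len
    snrs = []
    for i in range(n_frames):
        s = clean[i * frame_len:(i + 1) * frame_len]
        e = s - processed[i * frame_len:(i + 1) * frame_len]  # residual error
        snr = 10 * np.log10((np.sum(s ** 2) + 1e-12) / (np.sum(e ** 2) + 1e-12))
        snrs.append(np.clip(snr, lo, hi))
    return float(np.mean(snrs))

# Toy check: an error equal to 10% of the clean signal is 20 dB below it.
clean = np.sin(2 * np.pi * 0.01 * np.arange(2048))
processed = 1.1 * clean
seg = segmental_snr(clean, processed)
```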

1. Introduction

In single-channel speech enhancement systems, it is well known that there are two open problems for spectral subtraction [1,2]. One is how to estimate the noise power spectral density (NPSD) in adverse environments; the other is how to suppress the non-stationary noise components effectively even when the NPSD is severely underestimated. Researchers have made great efforts to solve these two problems during the last four decades [3–9,11–19].
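Both problems are visible in the basic power spectral subtraction rule, sketched below as a per-bin gain. This is a generic textbook form, not the variant of [1,2] or of this paper. The spectral floor of 0.01 and the name `spectral_subtraction_gain` are assumptions for illustration.

```python
import numpy as np

def spectral_subtraction_gain(noisy_psd, noise_psd, floor=0.01):
    """Power spectral subtraction written as a gain, with a spectral floor."""
    # Subtract the noise estimate, but never go below a fraction of the input.
    clean_psd = np.maximum(noisy_psd - noise_psd, floor * noisy_psd)
    return np.sqrt(clean_psd / np.maximum(noisy_psd, 1e-12))

# Three bins: speech-dominated, noise-only, and a bin where the noise
# estimate exceeds the observed power.
noisy_psd = np.array([4.0, 1.0, 0.5])
noise_psd = np.array([1.0, 1.0, 1.0])
gain = spectral_subtraction_gain(noisy_psd, noise_psd)
```

If `noise_psd` is too large, the gain in speech-dominated bins drops and speech is attenuated. If it is too small, noise passes through, which is the underestimation case described above.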

Estimating the NPSD from noisy speech is a non-trivial task, especially when the noise is extremely non-stationary. Generally, there are two categories of NPSD estimation algorithms. The first updates the NPSD only in noise-only segments, where an accurate voice activity detection (VAD) algorithm is often needed and important [2]. The second can update the NPSD in both non-speech and speech segments, and this non-VAD approach is more attractive and popular because of its capability of tracking the NPSD in speech segments [4–9,11]. Recently, many algorithms have been proposed to track non-stationary noise. Martin proposed the well-known minimum statistics (MS) method, which can track decreasing noise levels immediately but has a large delay in tracking increasing noise levels [4]. Cohen proposed the minima controlled recursive averaging (MCRA) method to improve the tracking capability of the MS method [5]. Both the MS method and the MCRA method were further improved by Rangachari and Loizou [6]. In [8], Hendriks et al. proposed a low-complexity MMSE estimator of the NPSD. A relatively complete evaluation of these NPSD methods can be found in [11].
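The asymmetric tracking behaviour attributed to the MS method can be reproduced with a heavily simplified minimum tracker: smooth the periodogram over time, then take the minimum over a sliding window of past frames. This sketch omits the bias compensation and adaptive smoothing of the actual MS method [4]. The smoothing constant 0.85, the window of 50 frames, and the function name are assumptions.

```python
import numpy as np

def min_tracking_noise_psd(periodograms, alpha=0.85, window=50):
    """Simplified minimum tracking: recursive smoothing, then a sliding minimum."""
    n_frames, _ = periodograms.shape
    smoothed = np.empty_like(periodograms, dtype=float)
    noise = np.empty_like(smoothed)
    smoothed[0] = periodograms[0]
    for t in range(1, n_frames):
        smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * periodograms[t]
    for t in range(n_frames):
        start = max(0, t - window + 1)
        noise[t] = smoothed[start:t + 1].min(axis=0)  # minimum over recent frames
    return noise

# Noise-only toy input: the level jumps from 1 to 4 at frame 100
# and back to 1 at frame 200.
level = np.ones(300)
level[100:200] = 4.0
periodograms = np.tile(level[:, None], (1, 4))
noise = min_tracking_noise_psd(periodograms)
```

In this toy input the estimate stays at the old level until the window no longer contains frames from before the jump (about `window` frames), but it starts to fall as soon as the level drops.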

It is an inevitable problem that the NPSD is often underestimated in adverse environments for single-channel speech enhancement systems. The residual noise may become more unpleasant to the ear when only the stationary noise components are totally suppressed, because the dynamic range of the noise becomes larger than before [20]. To make the residual noise sound natural, numerous algorithms have been proposed in the last two decades. Some researchers suggested preserving a certain amount of background noise, a scheme that simply reduces the dynamic range of the residual noise. In [13,14], the auditory masking properties were applied to suppress more non-stationary noise components. Breithaupt et al. used the cepstral smoothing technique to suppress both the musical noise and some non-stationary noise components [15–17]. Wang et al. proposed using the modified cepstrum thresholding (MCT) technique to achieve the same objective [18].
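As a rough illustration of the general idea behind smoothing in the cepstral domain, the sketch below recursively smooths the cepstrum of the log gains over time, lightly at low quefrencies (spectral envelope) and strongly elsewhere. It is not the algorithm of [15–17] or [18]. In particular, it does not protect the pitch peak, and the constants `beta_low`, `beta_high`, and `n_low` are assumptions.

```python
import numpy as np

def cepstral_smooth_gains(gains, beta_low=0.2, beta_high=0.9, n_low=8):
    """Temporal smoothing of log-gains in the cepstral domain (illustrative).

    gains has shape (n_frames, n_bins), with n_bins = n_fft // 2 + 1.
    """
    n_frames, n_bins = gains.shape
    n_fft = 2 * (n_bins - 1)
    # Quefrency-dependent smoothing constants; the cepstrum is symmetric,
    # so the low-quefrency setting is mirrored at the end of the array.
    beta = np.full(n_fft, beta_high)
    beta[:n_low] = beta_low
    beta[n_fft - n_low + 1:] = beta_low
    cep = np.fft.irfft(np.log(np.maximum(gains, 1e-6)), n_fft, axis=1)
    smoothed = np.empty_like(cep)
    smoothed[0] = cep[0]
    for t in range(1, n_frames):
        smoothed[t] = beta * smoothed[t - 1] + (1 - beta) * cep[t]
    return np.exp(np.fft.rfft(smoothed, axis=1).real)

# A constant gain of 0.1 with one isolated outlier, similar to the
# random spectral peaks that are heard as musical noise.
gains = np.full((10, 129), 0.1)
gains[5, 40] = 1.0
smoothed_gains = cepstral_smooth_gains(gains)
```

An isolated peak in one bin and one frame spreads over all quefrencies, so most of it falls under the strong smoothing and is attenuated, while a slowly varying spectral envelope passes largely unchanged.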

In this paper, we propose a new scheme to improve the tracking capability of the existing NPSD estimation methods. The scheme is based on the fact that voiced speech often lasts a long time. A cepstrum-based preprocessing scheme is proposed to suppress the harmonic components of the voiced speech before estimating the NPSD; this scheme is partly motivated by recent work analyzing the theoretical properties of cepstral coefficients [21–24]. Experimental results verify that the proposed cepstrum-based preprocessing scheme can track the non-stationary noise and avoid overestimating the NPSD simultaneously.
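The details of the proposed preprocessing are given in the remainder of the paper, which is not reproduced here. Purely to illustrate what suppressing harmonic components in the cepstral domain can look like, the sketch below applies a plain low-quefrency lifter: it keeps the first few cepstral coefficients (spectral envelope) and discards the rest, including the pitch peak. The cutoff `n_keep` and the function name are assumptions, and this should not be read as the authors' scheme.

```python
import numpy as np

def lifter_out_harmonics(power_spectrum, n_keep=20):
    """Keep only low-quefrency cepstral coefficients of a one-sided power spectrum."""
    n_bins = len(power_spectrum)
    n_fft = 2 * (n_bins - 1)
    cep = np.fft.irfft(np.log(power_spectrum + 1e-12), n_fft)
    lifter = np.zeros(n_fft)
    lifter[:n_keep] = 1.0
    lifter[n_fft - n_keep + 1:] = 1.0  # mirrored half of the symmetric cepstrum
    return np.exp(np.fft.rfft(cep * lifter).real)

# Toy log spectrum: a smooth envelope plus a harmonic ripple at quefrency 80.
m = np.arange(257)
envelope = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * m / 512)
ripple = np.cos(2 * np.pi * 80 * m / 512)
flattened = lifter_out_harmonics(np.exp(envelope + ripple))
```

In this toy case the ripple is removed and only the envelope remains. A noise tracker fed with such a flattened spectrum would see less of the harmonic energy of voiced speech.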

http://dx.doi.org/10.1016/j.apacoust.2013.06.001

⇑ Corresponding author. Tel.: +86 10 82547945; fax: +86 10 62553898.

E-mail address: cszheng@mail.ioa.ac.cn (C. Zheng).

Applied Acoustics 74 (2013) 1458–1462


