A Study of Deep Neural Network Speech Enhancement in Low Signal-to-Noise-Ratio Environments

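"Improving deep neural network speech enhancement in low signal-to-noise-ratio environments"

In the research paper "Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments", the authors examine how to improve speech intelligibility in noisy conditions. They propose a joint framework that combines speech enhancement (SE) with voice activity detection (VAD) to address the poor performance of deep neural networks (DNNs) under adverse signal-to-noise ratio (SNR) conditions. DNNs have recently been used successfully as regression models for speech enhancement, but in some speech segments the noise energy dominates, which can distort the speech and leave performance unsatisfactory.

The method has two steps:

1. First, a DNN-based VAD model is trained to generate frame-level speech activity labels. VAD is the key to separating speech segments from non-speech segments, and it helps distinguish background noise from the actual speech signal. By analyzing frame-level SNR information in the training set, the model can determine more accurately which intervals contain speech and which contain only noise.
2. Second, the VAD output guides the DNN-based speech enhancement. Once the frames that contain speech have been identified, the DNN enhances those frames specifically, reducing the effect of noise on the speech. The aim is to preserve and improve speech quality while avoiding over-processing of the noise-only portions, which reduces distortion.

The paper may also cover the following:

- Dataset: the study may have used a large amount of labeled speech data recorded under varied noise conditions, covering different types of background noise and different SNR levels, to train and test the model.
- Model architecture: the DNN may be built from multilayer perceptrons, convolutional neural networks (CNNs), or recurrent neural networks (RNNs), which can capture the time-series characteristics of speech signals.
- Training strategy: techniques such as regularization, weight decay, or data augmentation may have been used to prevent overfitting and to improve generalization to unseen noise conditions.
- Evaluation metrics: objective and subjective measures may have been used, such as SNR improvement, perceptual evaluation of speech quality (PESQ), and mean opinion score (MOS) listening tests.
- Experimental results: the authors may have compared their method with traditional methods or existing DNN models and shown a clear improvement in low-SNR environments.

The paper focuses on improving the clarity of speech signals in noisy environments. By integrating VAD with a DNN, it offers a better solution for applications such as voice communication and speech recognition, and the work matters for how well speech processing performs under real-world conditions.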

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments

Tian Gao, Jun Du, Yong Xu, Cong Liu, Li-Rong Dai, and Chin-Hui Lee

University of Science and Technology of China, Hefei, Anhui, People’s Republic of China

{gtian09,xuyong62}@mail.ustc.edu.cn, {jundu,lrdai}@ustc.edu.cn

iFlytek Research, iFlytek Co., Ltd., Hefei, Anhui, People’s Republic of China

congliu2@iflytek.com

Georgia Institute of Technology, Atlanta, GA, USA

chl@ece.gatech.edu

Abstract. We propose a joint framework combining speech enhancement (SE) and voice activity detection (VAD) to increase the speech intelligibility in low signal-noise-ratio (SNR) environments. Deep Neural Networks (DNN) have recently been successfully adopted as a regression model in SE. Nonetheless, the performance in harsh environments is not always satisfactory because the noise energy is often dominating in certain speech segments causing speech distortion. Based on the analysis of SNR information at the frame level in the training set, our approach consists of two steps, namely: (1) a DNN-based VAD model is trained to generate frame-level speech/non-speech probabilities; and (2) the final enhanced speech features are obtained by a weighted sum of the estimated clean speech features processed by incorporating VAD information. Experimental results demonstrate that the proposed SE approach effectively improves short-time objective intelligibility (STOI) by 0.161 and perceptual evaluation of speech quality (PESQ) by 0.333 over the already-good SE baseline systems at −5 dB SNR of babble noise.
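The two ingredients named in the abstract, frame-level SNR analysis and a VAD-weighted sum of clean-speech estimates, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the function names `frame_snr_db` and `vad_weighted_sum`, the frame length and hop, the 0 dB labelling threshold, and the fusion rule `p * a + (1 - p) * b` are hypothetical choices for illustration. This excerpt does not give the paper's actual labelling procedure or weighting formula.

```python
import math


def frame_snr_db(clean, noise, frame_len=256, hop=128, eps=1e-12):
    """Per-frame SNR in dB from time-aligned clean-speech and noise samples."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    snrs = []
    for start in range(0, len(clean) - frame_len + 1, hop):
        speech_energy = sum(x * x for x in clean[start:start + frame_len])
        noise_energy = sum(x * x for x in noise[start:start + frame_len])
        # eps keeps the ratio finite when a frame is completely silent.
        snrs.append(10.0 * math.log10((speech_energy + eps) / (noise_energy + eps)))
    return snrs


def vad_weighted_sum(speech_prob, est_if_speech, est_if_nonspeech):
    """Blend two per-frame feature estimates using each frame's speech probability.

    ASSUMPTION: one plausible reading of "weighted sum ... incorporating VAD
    information"; not taken from the paper.
    """
    if not (len(speech_prob) == len(est_if_speech) == len(est_if_nonspeech)):
        raise ValueError("all inputs must cover the same number of frames")
    fused = []
    for p, a, b in zip(speech_prob, est_if_speech, est_if_nonspeech):
        if not 0.0 <= p <= 1.0:
            raise ValueError("speech probabilities must lie in [0, 1]")
        if len(a) != len(b):
            raise ValueError("feature vectors must have the same dimension")
        fused.append([p * x + (1.0 - p) * y for x, y in zip(a, b)])
    return fused


# Toy signals: silence followed by a sine "speech" burst, plus constant noise.
clean = [0.0] * 256 + [math.sin(0.1 * i) for i in range(256)]
noise = [0.01] * 512
snrs = frame_snr_db(clean, noise)

# ASSUMPTION: a hard 0 dB threshold turns frame SNRs into VAD training labels.
labels = [1 if snr > 0.0 else 0 for snr in snrs]

# Hypothetical VAD probabilities and two hypothetical feature estimates per frame.
probs = [0.0, 0.5, 1.0]
est_a = [[2.0, 4.0] for _ in probs]  # estimate assuming the frame is speech
est_b = [[0.0, 0.0] for _ in probs]  # estimate assuming the frame is non-speech
fused = vad_weighted_sum(probs, est_a, est_b)
```

A soft weight lets frames with uncertain speech activity fall between the two estimates instead of being forced to one of them, which is the general motivation for using probabilities rather than hard VAD decisions.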

Keywords: Speech enhancement · Low SNR · Deep neural networks · Voice activity detection · Speech intelligibility

1 Introduction

Speech enhancement (SE) has been an open research problem for the past several decades. Many approaches have been developed to solve this problem, and they can be classified into two categories, namely unsupervised and supervised methods. Unsupervised approaches include spectral subtraction [1], the MMSE-based log-spectral amplitude estimator [2] and optimally modified log-MMSE

This work was supported by the National Natural Science Foundation of China under Grant No. 61305002. We would like to thank iFLYTEK Research for providing the training data and DNN training platform.

© Springer International Publishing Switzerland 2015

E. Vincent et al. (Eds.): LVA/ICA 2015, LNCS 9237, pp. 75–82, 2015.

DOI: 10.1007/978-3-319-22482-4
