Acoustic source localization: a steered response power approach with trade-off prewhitening


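"A steered response power approach with trade-off prewhitening for acoustic source localization." This paper presents a sound source localization technique aimed at noisy and reverberant conditions. Sound source localization, a central problem in audio processing, estimates the position or direction of a sound source. The steered response power (SRP) method is widely used for this task: it determines the source direction from the signals received by a microphone array. Conventional SRP is robust to noise but sensitive to reverberation, whereas SRP with phase transform (PHAT) prefiltering is robust to reverberation but degrades in noise.

To balance these two behaviors, the authors propose an SRP method with trade-off prewhitening. Prewhitening is a preprocessing step that flattens the signal spectrum so that all frequency components carry equal weight. The authors use the sparsity of the speech amplitude spectrum to build a convex-constraint linear prediction model, which they solve with a split Bregman method. Sparsity here means that, in a suitable representation, most coefficients are close to zero and only a few are significant. The prediction error signals then serve as the prefiltered microphone signals.

The resulting approach unifies conventional SRP and SRP with PHAT prefiltering, and achieves a compromise between the two in terms of localization performance.

The method is evaluated in noisy and reverberant environments, where the paper reports that it localizes sources more accurately and more robustly than conventional SRP. Potential applications include speech recognition, robot navigation, conference audio systems, and acoustic monitoring.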

A steered response power approach with trade-off prewhitening for acoustic source localization

Hongsen He, Xueyuan Wang, Yingyue Zhou, and Tao Yang

School of Information Engineering and Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Southwest University of Science and Technology, Mianyang, 621010, China

(Received 12 September 2017; revised 31 December 2017; accepted 25 January 2018; published online 16 February 2018)

This paper proposes a steered response power (SRP) approach with trade-off prewhitening for acoustic source localization. To obtain an effective compromise in prefiltering the microphone signals, the sparsity of the speech amplitude spectrum is used to establish a convex-constraint linear prediction model, which is solved by a split Bregman method. The presented approach unifies the traditional SRP method and the SRP method with phase transform prefiltering, and achieves a good compromise between them from the perspective of localization performance. The superiority of the proposed method is demonstrated in noisy and reverberant environments.

© 2018 Acoustical Society of America. https://doi.org/10.1121/1.5024652

[KGS] Pages: 1003–1007

I. INTRODUCTION

Acoustic source localization, which estimates the position coordinates or directions of arrival (DOAs) of sound sources, is critical in most acoustic applications, such as sonar detection, hands-free voice communication, human-computer interfaces, and industrial damage detection systems. Microphone arrays serve as the spatial aperture needed to process the auditory scene and yield source location estimates. Among the acoustic source localization techniques based on microphone arrays, maximizing the steered response power (SRP) of a beamformer is an important approach. It has been shown experimentally that the SRP technique is robust to noise but sensitive to reverberation.
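The idea of maximizing the steered response power can be sketched as follows. This is only an illustrative frequency-domain delay-and-sum scan, not the authors' implementation: it assumes a linear array, a single far-field source, and a sound speed of 343 m/s, and the name `srp_scan` and its parameters are hypothetical.

```python
import numpy as np

def srp_scan(x, mic_pos, fs, angles_deg, c=343.0):
    """Steered response power of a delay-and-sum beamformer (far-field sketch).

    x          : (M, N) array, one row per microphone signal
    mic_pos    : (M,) microphone positions along the array axis, in meters
    fs         : sampling rate in Hz
    angles_deg : candidate DOAs in degrees, measured from the array axis
    c          : assumed speed of sound in m/s
    """
    x = np.asarray(x, dtype=float)
    mic_pos = np.asarray(mic_pos, dtype=float)
    n_samples = x.shape[1]
    spectra = np.fft.rfft(x, axis=1)                 # (M, F)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)   # (F,)
    power = np.empty(len(angles_deg))
    for i, theta in enumerate(np.deg2rad(angles_deg)):
        # A plane wave from angle theta reaches a microphone at position p
        # earlier than the origin by tau = p * cos(theta) / c seconds.
        tau = mic_pos * np.cos(theta) / c
        # Undo that time advance in each channel, then sum the channels.
        steer = np.exp(-2j * np.pi * freqs[None, :] * tau[:, None])
        beam = np.sum(spectra * steer, axis=0)
        power[i] = np.sum(np.abs(beam) ** 2)
    return power
```

The DOA estimate is the candidate angle with the largest output power, for example `angles_deg[np.argmax(srp_scan(x, mic_pos, fs, angles_deg))]`.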

To improve the robustness of SRP in room acoustic environments, phase transform (PHAT) prefiltering has been applied before the cross-correlations are computed. The resulting algorithm, termed steered response power-phase transform (SRP-PHAT) (Refs. 1 and 3), gains robustness to reverberation because the PHAT weighting whitens the microphone signals and thus emphasizes all frequencies equally. To enable real-time operation of SRP-PHAT, an inverse mapping method, which transforms three-dimensional candidate locations into one-dimensional relative delays, and a modified SRP-PHAT method with scalable spatial sampling have been presented to reduce the computational cost. To further enhance the spatial resolution of SRP-PHAT, an extended strategy based on an iterative grid decomposition procedure and a geometrically sampled grid method have also been proposed from a grid search perspective. The localization performance of SRP-PHAT, however, degrades under noisy conditions.
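The effect of the PHAT weighting on a single microphone pair can be illustrated with a short sketch. It is a generic generalized cross-correlation, not code from the paper; the function name `gcc` and the regularization constant `eps` are assumptions made for this example.

```python
import numpy as np

def gcc(x1, x2, phat=False, eps=1e-12):
    """Generalized cross-correlation of two equal-length signals.

    Returns (lags, r). The lag at which r peaks estimates by how many
    samples x2 is delayed relative to x1. With phat=True the cross-spectrum
    is normalized to unit magnitude, so every frequency bin contributes
    equally (the whitening described above).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = len(x1)
    # Zero-pad to 2n so the circular correlation equals the linear one.
    X1 = np.fft.rfft(x1, n=2 * n)
    X2 = np.fft.rfft(x2, n=2 * n)
    cross = X2 * np.conj(X1)
    if phat:
        cross = cross / (np.abs(cross) + eps)
    r = np.fft.irfft(cross, n=2 * n)
    # Reorder so that the lags run from -(n - 1) to n - 1.
    r = np.concatenate((r[-(n - 1):], r[:n]))
    lags = np.arange(-(n - 1), n)
    return lags, r
```

Without the weighting, strong low-frequency speech energy dominates the correlation; with it, only the phase carries information. That sharpens the peak in reverberation, but it also gives noise-dominated bins the same weight as speech-dominated ones, which is consistent with the degradation under noise noted above.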

In a recent work, the sparsity of the coefficient vector of a linear predictor was used to construct an $\ell_2/\ell_1$-norm optimization model to prewhiten microphone signals for time delay estimation (TDE). The sparsity penalty yields an effective compromise in TDE performance between noise and reverberation. In this work, we propose an alternative sparse linear prediction model to prewhiten microphone signals for acoustic source localization rather than TDE. We introduce the sparsity of the speech spectrum into the least-squares criterion to form a mixed-norm optimization model, which is solved by a split Bregman method. The prediction error signals are then used to establish a trade-off prewhitening based steered response power (TOP-SRP) estimator to measure the DOA of a sound source. This new approach unifies the SRP and SRP-PHAT methods from a DOA estimation performance perspective. The effectiveness of the developed algorithm is validated in noisy and reverberant environments.

II. ACOUSTIC SOURCE LOCALIZATION VIA TOP-SRP

A. Optimization model

Assume that there is a broadband sound source in the far field which radiates a plane wave. A microphone array with $M$ elements is exploited to capture the sound signals. We employ a linear predictor to prefilter microphone signals for acoustic source localization. To this end, we use the past samples of channel $m$ ($m = 1, 2, \ldots, M$) to predict its current sample $x(n)$ as follows:

$$x(n) = \sum_{k=1}^{K} a_k\, x(n-k) + e(n), \tag{1}$$

where $a_k$, $k = 1, 2, \ldots, K$, are prediction coefficients, $K$ is the length of the predictor, and $e(n)$ is the prediction error. Note that we have dropped the subscript $m$ for the simplicity of notation. In a vector/matrix form, the signal in Eq. (1) can be written as

$$\mathbf{x}(n) = \mathbf{X}(n)\,\mathbf{a} + \mathbf{e}(n), \tag{2}$$
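As a rough illustration of Eqs. (1) and (2), the sketch below fits the prediction coefficients by ordinary least squares and returns the prediction error as the prewhitened signal. This is deliberately not the paper's model: the sparsity constraint and the split Bregman solver are omitted, and the function name `lp_prewhiten` and the layout of the data matrix are assumptions made for this example.

```python
import numpy as np

def lp_prewhiten(x, K):
    """Least-squares linear prediction of x(n) from its K past samples.

    Returns (a, e): the coefficients a_1..a_K of Eq. (1) and the prediction
    error e(n) for n = K, ..., len(x) - 1. The error sequence is the
    prewhitened version of x.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Data matrix: the row for time n holds [x(n-1), x(n-2), ..., x(n-K)].
    X = np.column_stack([x[K - k : N - k] for k in range(1, K + 1)])
    target = x[K:]
    # Plain least squares, with no sparsity term (unlike the paper's model).
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    e = target - X @ a
    return a, e
```

Applying such a prefilter to every channel and feeding the error signals to an SRP scan mimics full prewhitening; the trade-off described in the paper comes from the additional sparsity term, which this sketch leaves out.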

Also at: State Key Laboratory of Acoustics, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China. Electronic mail: hongsenhe@gmail.com

J. Acoust. Soc. Am. 143 (2), February 2018

0001-4966/2018/143(2)/1003/5/$30.00 © 2018 Acoustical Society of America
