A New Pooled Angular Spectrum Algorithm for Multi-Source Localization: Efficient and Robust DOA Estimation

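This paper presents a new pooled angular spectrum (P-AS) method for localizing multiple acoustic sources, aimed at improving the accuracy and robustness of direction-of-arrival (DOA) estimation. The authors are from the School of Electronic and Optical Engineering at Nanjing University of Science and Technology. The method uses a widely spaced microphone array to obtain high resolution. Its distinguishing feature is that it pools only the angle components dominated by active sources in a 2-dimensional time marginal angular spectrum (TM-AS), which makes it robust to spatial aliasing, a common problem for conventional methods. Compared with other angular-spectrum-based DOA estimation methods, P-AS also handles short-duration sources better, because restricting the pooling to the components in which a source is active reduces the influence of noise and the chance of spurious detections.

In the experiments, the authors evaluate the robustness of P-AS to spatial aliasing and to short-duration sources using different numbers of sources, including a source that is active for only a few time frames. The results show that P-AS is more effective and robust than most existing angular-spectrum-based methods. The work is motivated by the growing availability of low-cost computational power, which makes more sophisticated microphone-array processing practical. In applications that depend on accurate source localization, such as audio signal processing, audio source separation, audio enhancement, speech recognition, and sonar, a robust DOA estimator of this kind may improve system performance, particularly where several dynamic sources must be localized and tracked in real time.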

A new pooled angular spectrum for multiple acoustic sources localization

Chen Chun-zeng, Zhao Zhao*, Xu Jia-xin, Xu Zhi-yong

Department of Electronic Engineering

School of Electronic and Optical Engineering

Nanjing University of Science and Technology

Nanjing 210094, China

zhaozhao@njust.edu.cn

Keywords: angular spectrum, DOA estimation, histogram, acoustic source localization

Abstract. In this paper, we present a new pooled angular spectrum (P-AS) for estimating the directions of arrival (DOAs) of multiple acoustic sources. This method uses a widely spaced microphone array for high resolution and pools only the angle components dominated by active sources in a 2-dimensional time marginal angular spectrum (TM-AS). Its robustness to both spatial aliasing and short-duration sources is evaluated with different numbers of sources, including a source that is active within only a few time frames. Experimental results show that the proposed approach is more effective and robust than most existing angular-spectrum-based DOA estimation methods.

Introduction

With the increasing availability of low-cost computational power over the last few decades, more and more research effort has been dedicated to developing sophisticated signal processing strategies for microphone arrays. Within many important fields, including blind signal separation (BSS), video conferencing, and passive acoustic detection for surveillance, the application of microphone arrays to the localization of sound sources is of particular interest.

Among the existing passive source localization approaches, methods based on the time difference of arrival (TDOA) have attracted much attention. The generalized cross-correlation with phase transform (GCC-PHAT) [1] is the most popular method. Nonetheless, its performance is limited by relatively high sidelobes in multi-source and multi-path propagation cases. Based on the concept that the sources are sparse and approximately satisfy W-disjoint orthogonality (W-DO) in the time-frequency domain, several approaches [2-3] built on the degenerate unmixing estimation technique (DUET) have been proposed, which directly calculate the inter-microphone phase difference (IPD) and the inter-microphone amplitude difference (IAD) of the microphone pairs. However, these methods rely on a mixing model that is restricted to anechoic mixtures. Moreover, when the delay between the two microphone signals is larger than one sample, phase unwrapping ambiguities arise due to spatial aliasing. The technique named DEMIX [4] has been developed to overcome the intrinsic ambiguities of phase unwrapping by clustering only the IADs. Nevertheless, when the sources are in the far field, the clustering process may fail because the IADs are small and arise mainly from additive noise.
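As a rough illustration of the TDOA-based approach discussed above, the following is a minimal sketch of GCC-PHAT for a single microphone pair, together with a far-field conversion from TDOA to DOA. It is a generic textbook formulation, not the implementation used in this paper. The function names, the speed of sound (343 m/s), and the broadside sign convention for the angle are assumptions made for the example. The helper `one_sample_spacing` simply evaluates the spacing at which the maximum inter-microphone delay equals one sample, which is the threshold for phase ambiguity mentioned in the text.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, using GCC-PHAT."""
    n = len(x1) + len(x2)              # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)      # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least one sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lag 0 sits at index max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs


def tdoa_to_doa_deg(tau, spacing, c=SPEED_OF_SOUND):
    """Far-field DOA in degrees from broadside (assumed sign convention)."""
    s = np.clip(c * tau / spacing, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))


def one_sample_spacing(fs, c=SPEED_OF_SOUND):
    """Microphone spacing (m) at which the maximum delay equals one sample."""
    return c / fs
```

With a sampling rate of 16 kHz, `one_sample_spacing(16000)` is about 2.1 cm, so a widely spaced pair (for example, 0.5 m) exceeds the one-sample limit by a large margin. Passing `max_tau=spacing / SPEED_OF_SOUND` restricts the peak search to delays that are physically possible for the given spacing.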

For multiple acoustic sources in a reverberant environment, DOA estimation can be achieved by iteratively estimating the time-frequency bins associated with each source and clustering the corresponding TDOAs [5]. These methods can be used with any microphone spacing but are sensitive to the parameter initialization of the clusters. Other approaches, on the other hand,
