A new pooled angular spectrum for multiple acoustic sources
localization
Chen Chun-zeng, Zhao Zhao*, Xu Jia-xin, Xu Zhi-yong
Department of Electronic Engineering
School of Electronic and Optical Engineering
Nanjing University of Science and Technology
Nanjing 210094, China
zhaozhao@njust.edu.cn
Keywords: angular spectrum, DOA estimation, histogram, acoustic source localization
Abstract. In this paper, we present a new pooled angular spectrum (P-AS) for estimating the
directions of arrival (DOAs) of multiple acoustic sources. The method uses a widely spaced
microphone array for high resolution and pools only the angle components dominated by active
sources in a 2-dimensional time marginal angular spectrum (TM-AS). Its robustness to both
spatial aliasing and short-duration sources is evaluated with different numbers of sources,
including a source that is active within only a few time frames. Experimental results show that the
proposed approach is more effective and robust than most existing
angular-spectrum-based DOA estimation methods.
Introduction
With the increasing availability of low-cost computational power over the last few decades,
more and more research efforts have been dedicated to developing sophisticated signal
processing strategies for microphone arrays. In many important fields, including blind
signal separation (BSS), video conferencing and passive acoustic detection for surveillance, the
application of microphone arrays to the localization of sound sources is of particular interest.
Among the existing passive source localization approaches, methods based on time
difference of arrival (TDOA) have attracted much attention. The generalized cross-correlation
with phase transform[1] (GCC-PHAT) is the most popular method. Nonetheless, its
performance is restricted by the relatively high sidelobes in multi-source and multi-path
propagation cases. Based on the assumption that the sources are sparse and approximately W-disjoint
orthogonal (W-DO) in the time-frequency domain, several approaches[2-3] built on the degenerate
unmixing estimation technique (DUET) have been proposed, which directly calculate the
inter-microphone phase difference (IPD) and the inter-microphone amplitude difference (IAD)
of the microphone pairs. However, these methods rely on a mixing model that is
restricted to anechoic mixtures. Moreover, when the delay between the two microphone signals
exceeds one sample, phase unwrapping ambiguities due to spatial aliasing arise. The
technique named DEMIX[4] has been developed to overcome the intrinsic ambiguities of phase
unwrapping by clustering only the different IADs. Nevertheless, when the sources are in the
far field, the clustering process may fail because the IADs are small and arise mainly from additive noise.
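To make the TDOA-based approach discussed above more concrete, the following is a minimal
Python sketch of GCC-PHAT for a single microphone pair. It is illustrative only and is not
taken from the cited work[1]: the function name gcc_phat, the arguments x1, x2 and fs, and the
small regularization constant are assumptions introduced here, and NumPy is assumed to be
available.

```python
import numpy as np

def gcc_phat(x1, x2, fs):
    """Estimate the delay of x2 relative to x1, in seconds, using GCC-PHAT.

    A positive result means that x2 lags behind x1.
    (Hypothetical helper for illustration; names are not from the paper.)
    """
    # Zero-pad to the combined length so the circular correlation
    # computed via the FFT does not wrap around.
    n = len(x1) + len(x2)
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    # Cross-power spectrum of the microphone pair.
    cross = X2 * np.conj(X1)

    # PHAT weighting: discard the magnitude and keep only the phase.
    # The small constant (an assumption) avoids division by zero.
    cross /= np.abs(cross) + 1e-12

    # Back to the lag domain.
    cc = np.fft.irfft(cross, n=n)

    # Reorder so that lags run from -max_shift to +max_shift.
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))

    # The lag of the largest peak is the TDOA estimate in samples.
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

For a source in the far field and a pair with spacing d, the estimated delay tau could then be
mapped to an angle through tau = d*cos(theta)/c, where c is the speed of sound. With several
simultaneous sources or strong reflections, the correlation function contains several competing
peaks, which is the sidelobe limitation noted above.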
For multiple acoustic sources in a reverberant environment, DOA estimation can be achieved
by iteratively estimating the time-frequency bins associated with each source and clustering the
corresponding TDOAs[5]. These methods can be used for any microphone spacing but are
sensitive to the parameter initialization of the clusters. Other approaches, on the other hand,