Range Extrapolation of Head-Related Transfer
Function using Improved Higher Order Ambisonics
Ling-song Zhou, Chang-chun Bao, Mao-shen Jia and Bing Bu
Speech and Audio Signal Processing Laboratory, School of Electronic Information and Control Engineering,
Beijing University of Technology, Beijing 100124, China
E-mail: lszhou@emails.bjut.edu.cn, {chchbao, jiamaoshen}@bjut.edu.cn, bubing@emails.bjut.edu.cn
Abstract— 3D audio technology based on binaural
reproduction requires Head-Related Transfer Function
(HRTF) datasets to be available for all possible distances.
However, due to the tedious measurement work and the large
volume of the resulting datasets, the HRTF is typically measured
only for sources located at a fixed distance. In this paper, the
concept of virtual loudspeaker arrays is utilized to achieve range
extrapolation of HRTF datasets measured at a single range.
The virtual loudspeakers are driven by Higher Order Ambisonics
(HOA). Specifically, to limit the near-field effect of HOA, a
compensation method based on a modified Wiener filter is proposed.
The simulation results indicate that the proposed method provides
effective range extrapolation of the HRTF.
Index Terms—Head-Related Transfer Function, range
extrapolation, improved Higher Order Ambisonics
I. INTRODUCTION
A 3D audio system aims to reproduce a spatial and immersive
auditory scene for one or more listeners via headphones or
loudspeakers. It is frequently used in virtual auditory
environments (VAEs), consumer electronics and multimedia
entertainment. Spatial perception via headphones is
typically realized by binaural reproduction using the
Head-Related Transfer Function (HRTF). The HRTF represents the
acoustic filtering from a sound source to the left and
right ears, and characterizes the scattering properties of
human anatomy (especially the pinnae, head and torso).
HRTFs are usually measured in anechoic environments. To
limit the measurement effort, they are typically
measured on a circle or a spherical surface with a fixed radius
and at a finite set of directions. Many methods have been
proposed for angular interpolation of HRTFs; we do not review
them here, since such interpolation methods are
beyond the scope of this paper. We are concerned with range
extrapolation, that is, the problem of computing
HRTFs at different distances.
Typical distances of the currently available datasets are in
the range of 1.5 to 3 meters. It is generally accepted that, for
sources more than 3 m away from the listener, the source distance
does not essentially affect the characteristics of the HRTF [1].
However, for near-field sources, the HRTF changes significantly
with the source distance. Such an HRTF is generally
termed a near-field HRTF, which is the focus of this paper.
Much work has been done to compute near-field
HRTFs from HRTFs measured at a fixed distance, but the most
appropriate extrapolation method can still be considered an open
question. In [2], multi-pole expansion (ME) expresses the HRTF
as a series of multi-pole solutions of the Helmholtz
equation. The weights of the ME are fitted by a Regularized Least
Squares (RLS) technique. However, RLS involves a matrix
inversion. If the matrix is ill-conditioned, a stable solution
may not exist. The reciprocity principle and a modal expansion of the
acoustic wave equation are introduced in [3] to extrapolate the HRTF
to all ranges, but the extrapolation is only
possible if no objects are located in the space between the
sources and the human subject. Both of these methods are
based on a harmonic expansion of the HRTF. Another
class of methods adopts the concept of virtual loudspeaker
arrays. The acoustic filtering process from a circular or
spherical distribution of sources to the ears is characterized by
the measured HRTFs. The distribution of these sources can be
treated as a virtual loudspeaker array. When each virtual
loudspeaker is driven by an appropriate signal to synthesize a
target virtual source, the HRTF from the virtual source to the left
and right ears can be synthesized. In [4] and [5], Spors et al.
introduced Wave Field Synthesis (WFS) for the derivation of
the virtual loudspeaker driving signals. However, the work in
[6] shows that the performance of WFS is insufficient for
near-field source reproduction.
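The virtual loudspeaker array concept can be illustrated with a minimal sketch. It assumes that the driving functions have already been derived by some method (WFS, HOA or otherwise) and are given per loudspeaker and per frequency bin, so that the synthesized HRTF for one ear is the sum of the measured HRTFs weighted by those driving functions. The function name `synthesize_hrtf` and the array layout are hypothetical choices made for this illustration only; they are not taken from the paper.

```python
import numpy as np


def synthesize_hrtf(measured_hrtf, driving):
    """Weighted superposition of measured HRTFs (illustrative sketch).

    measured_hrtf : complex array, shape (L, F)
        HRTF from each of L virtual loudspeakers to one ear at F
        frequency bins (assumed layout).
    driving : complex array, shape (L, F)
        Driving function of each virtual loudspeaker at each bin,
        assumed to be computed elsewhere.

    Returns a complex array of shape (F,): the synthesized HRTF from
    the target virtual source to that ear.
    """
    measured_hrtf = np.asarray(measured_hrtf, dtype=complex)
    driving = np.asarray(driving, dtype=complex)
    if measured_hrtf.shape != driving.shape:
        raise ValueError("measured_hrtf and driving must have the same shape")
    # Sum the loudspeaker contributions independently at each frequency bin.
    return np.sum(driving * measured_hrtf, axis=0)
```

As a sanity check on this sketch, if only one virtual loudspeaker is driven with unit gain, the synthesized HRTF reduces to the measured HRTF of that loudspeaker. The same function would be called once per ear.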
This paper also follows the concept of virtual
loudspeaker arrays. However, Higher Order Ambisonics
(HOA) is used to derive the virtual loudspeaker driving
signals. HOA exhibits a near-field effect when it reproduces
a near-field source. A modified Wiener filter is proposed in
this paper to compensate for this effect. The
benefits of the proposed method are illustrated in the
following sections.
II. IMPROVED HIGHER ORDER AMBISONICS
A. Basic Principles of HOA
The HOA system adopts cylindrical/spherical harmonics as
a means to represent and reproduce a 2D/3D sound field in
free space [7]. For simplicity, we focus on the reproduction of
a 2D, or height-invariant, sound field in which the sources are
located in the horizontal plane.
In cylindrical coordinates, the acoustic pressure
perturbation P(x,k) propagating in a homogeneous medium is
governed by the Helmholtz equation in the frequency domain [8]
978-616-361-823-8 © 2014 APSIPA