(Moving Picture Experts Group) audio standards, i.e., MPEG-1 [27] and MPEG-2 [28]. Furthermore,
several successful commercial audio standards have been published, including Sony’s Adaptive
TRansform Acoustic Coding (ATRAC), DTS Coherent Acoustics (DTS-CA), and Dolby’s
Audio Coder-3 (AC-3). Elements or entire algorithms for perceptual coding have also appeared
in [21, 23], [27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53].
With the emergence of surround sound systems, multi-channel encoding formats also gained interest
[54, 55, 56]. The advent of ISO/IEC MPEG-4 standardization [45, 47] established new research
goals for high-quality coding of general audio signals even at low bit rates. MPEG-4 audio encompasses
an integrated family of algorithms with wide-ranging provisions for scalable, object-based
speech and audio coding at bit rates from 200 bps up to 64 kbps per channel [57, 58].
1.1.1 RECENT AUDIO CODECS
The older MPEG-1 hybrid audio coding technique (ISO/IEC 11172-3) incorporates subband filter
bank decomposition, signal transforms such as the FFT, and psychoacoustic analysis. MPEG-1 audio
operates on 16-bit PCM input audio data and accommodates sample rates of 32, 44.1, and 48 kHz.
Operating modes of this algorithm include mono, stereo, dual independent mono, and joint stereo.
The target bit rates are programmable in the range of 32 to 192 kbits/s for mono and 64 to 384 kbits/s
for stereo. Although MPEG-1 Layer-III (MP3) is still an active and popular standard,
several newer algorithms have been shown to perform better. Advanced Audio Coding (AAC) is a
standardized, lossy compression scheme that generally achieves better sound quality than MP3 at
similar bit rates. It has been standardized by ISO and IEC as part of the MPEG-2 and MPEG-4
standards. Designed as a successor to the MP3 algorithm, AAC allows more sampling frequencies
(8 kHz to 96 kHz) and supports up to 48 channels.
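To put the MPEG-1 target bit rates quoted above in perspective, the following minimal Python sketch compares them with the bit rate of the uncompressed 16-bit PCM input. The function names `pcm_bit_rate_kbps` and `compression_ratio` are illustrative and are not taken from any standard or library; the calculation is plain arithmetic and ignores container and side-information overhead.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bit rate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(sample_rate_hz, target_kbps, bits_per_sample=16, channels=1):
    """Ratio of the uncompressed PCM bit rate to the coded target bit rate."""
    return pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels) / target_kbps


if __name__ == "__main__":
    # Sample rates and bit-rate limits are the MPEG-1 values quoted in the text.
    for fs in (32000, 44100, 48000):
        mono = pcm_bit_rate_kbps(fs, channels=1)
        stereo = pcm_bit_rate_kbps(fs, channels=2)
        print(f"{fs} Hz: PCM mono {mono:.1f} kbit/s, stereo {stereo:.1f} kbit/s")
        print(f"  mono   at 32..192 kbit/s: ratio "
              f"{compression_ratio(fs, 32):.1f} .. {compression_ratio(fs, 192):.1f}")
        print(f"  stereo at 64..384 kbit/s: ratio "
              f"{compression_ratio(fs, 64, channels=2):.1f} .. "
              f"{compression_ratio(fs, 384, channels=2):.1f}")
```

For example, 16-bit stereo PCM at 44.1 kHz requires 1411.2 kbit/s, so the stereo target range corresponds to compression ratios of roughly 4:1 to 22:1.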
Though perceptual audio coders such as MP3 and AAC offer reasonably good quality
at bit rates down to 80 kbps, they are associated with an algorithmic delay that exceeds 120 ms.
Applications such as two-way communications or broadcasting require low end-to-end delays
on the order of 20 ms. As a result, Low Delay (LD) audio coding schemes have been developed;
they provide perceptual quality comparable to MP3 or AAC with a very low algorithmic delay. The
MPEG-4 AAC audio coder is used as a basis to build the low-delay functionality preferred in
end-to-end applications such as teleconferencing and telephony. Typical bit rates of AAC-LD start
at 32 kbps for a mono signal with a 22 kHz sampling rate and reach 128 kbps, providing excellent
audio quality [59]. AAC-ELD (Enhanced Low Delay) was standardized as part of MPEG in
January 2008. The algorithmic delay of AAC-ELD ranges from 32 ms at 24 kbps down to 15 ms at 64 kbps.
AAC-ELD combines the advantages of AAC-LD for low-delay encoding/decoding and Spectral
Band Replication (SBR) for preserving high quality at low bit rates. Delay-critical applications such
as wideband audio/video conferencing and broadcasting, which require high-quality audio at low bit
rates, can benefit from this scheme [60]. The Ultra Low Delay (ULD) AAC [61] was developed at
Fraunhofer and attains delays on the order of 8 ms.
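The link between frame size and algorithmic delay discussed above can be illustrated with a rough Python sketch. It assumes a simplified model in which a block-transform coder must buffer one frame plus one overlap segment before output can begin; look-ahead for block switching, bit-reservoir buffering, and SBR processing are ignored, so real coders exhibit larger delays. The frame lengths used here (1024 and 480 samples) and the function names are illustrative assumptions, not values stated in this text.

```python
def framing_delay_ms(frame_len, sample_rate_hz, overlap=None):
    """Delay (ms) from buffering one frame plus its overlap (simplified model).

    If overlap is None, a 100% overlap equal to frame_len is assumed.
    """
    if overlap is None:
        overlap = frame_len
    return 1000.0 * (frame_len + overlap) / sample_rate_hz


def max_frame_len(budget_ms, sample_rate_hz):
    """Largest frame length (samples) that fits a delay budget with 100% overlap."""
    budget_samples = int(budget_ms * sample_rate_hz / 1000.0)
    return budget_samples // 2


if __name__ == "__main__":
    # Hypothetical long-frame coder versus a hypothetical short-frame coder.
    print(f"1024-sample frames at 44.1 kHz: {framing_delay_ms(1024, 44100):.1f} ms")
    print(f" 480-sample frames at 48 kHz:   {framing_delay_ms(480, 48000):.1f} ms")
    # Frame lengths compatible with the 20 ms and 8 ms delay targets in the text.
    for budget in (20, 8):
        print(f"{budget} ms budget at 48 kHz: at most "
              f"{max_frame_len(budget, 48000)} samples per frame")
```

Under this model, shortening the frame is the main lever for reducing delay, at the cost of coarser frequency resolution, which is one reason low-delay coders pair short frames with additional coding tools.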