978-1-4673-0963-9/10/$26.00 ©2012 IEEE
2012 5th International Congress on Image and Signal Processing (CISP 2012)
A Novel Speech Coding Algorithm for Cochlear
Implants
Hongyun LIU, Weidong WANG*, Kaiyuan LI, Zhengbo ZHANG
Department of Medical Engineering & Supply Center,
Chinese PLA General Hospital,
Beijing, China
Abstract—Cochlear implants (CI) can restore some degree of
hearing to individuals with severe to profound sensorineural
hearing loss. In recent years, new speech coding algorithms have been
developed to improve the performance of cochlear implants,
but sound recognition in noisy environments, tonal language
perception and music perception remain very difficult for most
cochlear implant users. To enhance speech recognition in noise, as
well as tonal language and music perception, a new speech coding
algorithm for cochlear implants, called Hilbert Huang Transform
Stimulating (HHTS), is presented. The Hilbert Huang Transform (HHT)
is a powerful tool for analyzing non-linear and non-stationary
signals; it consists of the sifting procedure of empirical mode
decomposition (EMD) followed by the Hilbert Transform (HT).
Instantaneous frequency can be derived from the time-frequency
description of the speech signal obtained through the sifting
procedure. The fine structure carries a great deal of information,
reflecting not only speech content, rhythm and tone but also the
speaker's individual characteristics, so a coding strategy needs to
capture both the envelope and the fine structure of speech more
precisely. The HHTS, continuous interleaved sampling (CIS), channel
specific sampling sequences (CSSS) and frequency amplitude
modulation encoding (FAME) strategies were simulated in MATLAB.
The synthesized stimuli and their spectra were compared with the
original signals by correlation analysis. Compared with the other
three strategies, HHTS obtained the highest correlation coefficient
between the spectrum of the synthesized signal and that of the
original speech, and the correlation is significant.
Keywords-Cochlear implant; Hilbert Huang Transform;
Empirical mode decomposition; Hilbert Transform
I. INTRODUCTION
Cochlear implants are accepted as the only medical
device that can restore partial hearing to individuals with
severe to profound sensorineural hearing loss through electric
stimulation of the residual auditory nerve. As of December 2010,
approximately 219,000 people worldwide had received
cochlear implants; in the U.S., roughly 42,600 adults and
28,400 children were recipients. Cochlear implants have been
remarkably successful in providing hearing to the profoundly
deaf, and modern multichannel cochlear implants yield
word recognition scores of around 80% for sentences in quiet,
allowing the majority of their users to talk on the phone
fluently [1][2]. However, the speech perception attainable by
cochlear implant users has reached a plateau with current
speech coding strategies. In addition,
serious limitations are observed in the representation of speech
in noise, tonal languages and music.
Traditional cochlear implants have generally employed two
types of speech coding algorithms. In the first type, only the
amplitude or envelope characteristics of the original speech are
extracted and used to modulate a fixed-rate biphasic pulse train, as
in the CIS strategy [3]. In the second type, the band-pass filtered
analog speech signal, which contains amplitude, frequency, phase
and fine structure information, is delivered directly to the
electrodes to stimulate the residual auditory nerve; the compressed
analog (CA) strategy is of this kind [4][5][6]. Each of the two
types has its own disadvantage: the first provides too little
information (amplitude or envelope modulation only), and the second
provides too much indiscriminable information [2]. During the past
few years, many speech coding algorithms have been proposed and
studied, such as channel specific sampling sequences (CSSS) [5],
wavelet zero-crossing stimulation (WZCS) [7][8], FAME and
asynchronous interleaved sampling (AIS) [9]. Besides amplitude or
envelope information, these algorithms extract and encode
frequency, phase, fine structure and other essential components to
improve the quality of sound perception for cochlear implant users
in a variety of circumstances.
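The distinction between envelope and fine structure drawn above can be illustrated with a short numerical sketch. It is not taken from the paper's MATLAB simulations: the function names, the sampling rate and the amplitude-modulated test tone are hypothetical choices made for illustration, and the analytic signal is computed with the standard FFT method using NumPy only. An envelope-only strategy would keep `envelope` and discard the information summarized by `inst_freq`.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real 1-D array (FFT method): zero the negative
    frequencies and double the positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist bin unchanged
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def envelope_and_fine_structure(x, fs):
    """Return the envelope and the instantaneous frequency (Hz) of x."""
    z = analytic_signal(x)
    envelope = np.abs(z)                  # slowly varying amplitude
    phase = np.unwrap(np.angle(z))        # fine structure as a phase track
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)
    return envelope, inst_freq

# Hypothetical test signal (assumption): 1 kHz carrier, 20 Hz amplitude modulation.
fs = 16000
t = np.arange(8000) / fs                  # 0.5 s of samples
modulator = 1.0 + 0.5 * np.cos(2 * np.pi * 20 * t)
x = modulator * np.cos(2 * np.pi * 1000 * t)

envelope, inst_freq = envelope_and_fine_structure(x, fs)
```

For this test tone the recovered envelope follows the 20 Hz modulator and the instantaneous frequency stays at the carrier frequency. Real speech in a single analysis band is not this clean, so the sketch only shows which quantity each type of strategy retains.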
Motivated by the literature and by physiological evidence,
we propose a novel speech coding algorithm that encodes the phase
or fine structure of the original sound in cochlear implants to
improve users' perception of noisy speech, tonal languages and
music. In the following sections, we first present an
algorithm that decomposes a signal into intrinsic mode
functions to obtain the slowly varying amplitude and the phase or
fine structure characteristics of the original speech. We term this
algorithm the Hilbert Huang Transform Stimulating (HHTS)
strategy. Computer simulations were conducted to verify the
algorithm’s accuracy and efficiency from a signal processing
point of view.
II. HILBERT HUANG TRANSFORM STIMULATING ALGORITHM
A. Hilbert Transform
The HT is an important mathematical tool in the field of
signal analysis and processing. Supposing there is a real