Java vs. Symbian: A Comparison of Software-based DSR Implementations on Mobile Phones
Dmitry Zaykovskiy and Alexander Schmitt
Institute of Information Technology, University of Ulm, Ulm, Germany
{dmitry.zaykovskiy, alexander.schmitt}@uni-ulm.de
Abstract
With the increasing processing power of today's mobile phones, practical use of Distributed Speech Recognition (DSR) technology comes within reach. This paper presents the ETSI DSR front-end standards as software-based implementations on the two most popular mobile phone platforms: Java ME and Symbian. We present performance comparisons between the different front-end standards as well as the execution times on current mobile phones. After showing that real-time feature extraction on current devices is possible, we furthermore point out hindrances and pitfalls during development and deployment.
Keywords: Java ME, Symbian, mobile phones, feature extraction, distributed speech recognition (DSR).
1 Introduction

The days when we used our mobile phones exclusively for telephone conversations are numbered. Today we have access to thousands of different applications and services for our mobile companions, and their number is rapidly growing. Although the devices have stopped getting smaller and smaller, we see ourselves confronted with a limited user interface consisting of tiny keys and a miniature display, which suffices for making phone calls but is unsuited to controlling applications. The most promising solution to this challenge is the use of speech recognition.

If we take a glance at modern mobile phones, we indeed discover basic speech recognition functionality: most mobile devices on the market support voice control features such as voice dialling or hands-free commands. Instead of searching for names in the telephone book, the user can dictate the name of the person they want to call or, on some devices, use a voice command to launch a particular phone function. Although this technology points out new ways to improve user interfaces on mobile phones, it still has several limitations:

• the number of recognizable words is very limited,
• usually the words have to be recorded by the user beforehand,
• the recognition system is speaker dependent.

Some recent devices spare the user the necessity to pre-record the commands and feature speaker-independent, embedded speech recognition. This functionality, however, is also restricted to a very limited quantity of words. With a growing number of entries in the phone book, the recognition rate suffers severely due to built-in low-cost processors, limited storage and RAM. With this embedded strategy, applications such as SMS dictation or the use of natural language are far out of reach for the near future. The most vividly discussed proposal to overcome this challenge is the principle of Distributed Speech Recognition. In this approach, the speech recognition process is separated into two parts: a front-end on the client-side and a back-end on the server-side. The front-end extracts characteristic features out of the speech signal, whereas the back-end, making use of the language and acoustic models, performs the computationally costly recognition.

[Figure 1: Architecture of a client-server DSR system. Recoverable labels: Client; Bit-stream; Server; Bit-Stream Decoding and Error Mitigation; Recognition Result.]

Figure 1 shows a system architecture for DSR. The client captures the speech signal using a microphone and extracts features out of the signal. The features are compressed in order to obtain low data rates and transmitted to the server. At the server back-end, the features are decompressed and subjected to the actual recognition process.

2 Related Work

While there have been a number of studies on the theory of DSR, little research has been done on deploying this technology to real mobile devices. The authors of [1] proposed a modified version of the widespread ETSI FE-standard for DSR, implemented as a hardware-based front-end solution for a Motorola Digital Signal Processor (DSP). A software-based implementation of the ETSI AFE-standard for PDAs using Sphinx-4 [10] as speech recog-