Abstract—Against the background of indoor interaction between humans and service robots, an auditory system for a mobile robot based on the Kinect sensor is designed to raise the robot's level of intelligence and to improve the human-robot interactive experience. By means of the auditory system, operators can use voice commands to make the mobile robot perform particular actions. Taking into account the effective pickup-angle range and the factors affecting the recognition rate of the auditory system, a function was implemented by which the mobile robot adjusts its own position and pose according to the sound-source orientation reported by the auditory system, ensuring effective indoor human-robot voice interaction. The performance of the sound-source-tracking auditory system and of the robot's speech-recognition-based motion control was verified in a real indoor environment.
I. INTRODUCTION
Along with the rapid development of robotics and intelligent technology, natural communication between robots and humans has drawn much attention [1]. Voice, as the most natural and convenient interactive mode of human society, may replace traditional methods to become the most important human-robot interaction channel in the future [2]. When robots and humans communicate with each other by voice, the navigation problem between them can be treated within a single framework, namely sound-source localization. The Kinect, a motion-sensing peripheral developed by Microsoft, is equipped not only with an infrared projector and a color camera but also with a microphone array at the bottom of the sensor. The four microphones forming the array are asymmetrically distributed in a linear structure. Owing to this asymmetric structure, the sound signals captured by the microphones can be used both for speech recognition and for sound-source localization [3].
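As a brief sketch of the underlying far-field idea (not the authors' implementation): for a pair of microphones separated by distance d, a plane wave from azimuth θ arrives at one microphone Δt = d·sin(θ)/c later than at the other, so θ can be recovered from the cross-correlation peak of the two signals. The sample rate, microphone spacing, and test signal below are illustrative assumptions, not the Kinect's exact geometry.

```python
import numpy as np

C = 343.0    # speed of sound, m/s (room-temperature value)
FS = 16000   # sample rate, Hz (assumed for illustration)
D = 0.10     # microphone spacing, m (assumed, not Kinect's exact geometry)

def estimate_azimuth(sig_left, sig_right):
    """Estimate sound-source azimuth (radians) for one microphone pair
    from the cross-correlation peak (classic far-field TDOA model)."""
    corr = np.correlate(sig_left, sig_right, mode="full")
    lag = np.argmax(corr) - (len(sig_right) - 1)  # delay in samples
    delta_t = lag / FS                            # delay in seconds
    # Far-field model: delta_t = D * sin(theta) / C
    return np.arcsin(np.clip(delta_t * C / D, -1.0, 1.0))

# Synthetic check: delay a noise burst by the lag a 30-degree source causes.
rng = np.random.default_rng(0)
theta_true = np.deg2rad(30.0)
lag_true = int(round(D * np.sin(theta_true) / C * FS))
noise = rng.standard_normal(4096)
left = np.concatenate([np.zeros(lag_true), noise])   # delayed copy
right = np.concatenate([noise, np.zeros(lag_true)])
theta_hat = estimate_azimuth(left, right)
```

At 16 kHz the delay is quantized to whole samples, so the recovered angle is coarse for small spacings; real arrays interpolate the correlation peak or fuse several microphone pairs.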
Considering the limited pickup-angle range and indoor acoustic sensor noise [4], satisfactory human-robot voice interaction relies not only on effective speech recognition technology; the robot must also adjust its own position and pose in a timely manner according to the azimuth of the operator issuing the voice commands. Target localization is of great significance for a robot's perception of its surrounding environment and for improving its adaptability, and it is a key area of mobile robot research [5]. Within robot target localization, visual perception is an important research direction. However, in unstructured indoor environments, because of uncertain illumination intensity and obstacles, visual perception may have great difficulty providing satisfactory positioning results [6], [7]. Inspired by the way biological systems locate targets by combining sight and hearing, joining visual perception with auditory technology may remedy the limitations of visual perception in such environments. Studies of vocal targets in indoor environments have gradually been carried out: the literature [8] analyzed the sound-source localization problem with an auditory-scene method, and the literature [9] studied a spatial source orientation method based on a microphone array and applied it to speaker localization. On this basis, this paper designs an auditory system for a mobile robot based on the Kinect sensor to explore the feasibility of, and remaining problems in, indoor human-robot voice interaction.

This work was supported by the National Natural Science Foundation of China (61305101) and the Hebei Province Natural Science Foundation (F2014202121 and F2010000137).
S. P. Wang was with Hebei University of Technology, Tianjin, China. He is now with the School of Control Science and Engineering, HEBUT (e-mail: wangsp_hebut@163.com).
P. Yang was with Hebei University of Technology, Tianjin, China. He is now with the School of Control Science and Engineering, HEBUT (e-mail: yphebut@163.com).
H. Sun was with Hebei University of Technology, Tianjin, China. He is now with the School of Control Science and Engineering, HEBUT (corresponding author; phone: +8613821177105; e-mail: sunhao@hebut.edu.cn).
II. THE AUDITORY SYSTEM OF THE ROBOT
The robot auditory system is mainly composed of a Microsoft Kinect device, an Advantech industrial personal computer, and a self-designed differential-drive mobile platform. The Kinect and the industrial computer are installed on the mobile robot platform. The structure of the robot system is shown in figure 1 below.
Part I of figure 1 is the Kinect sensor unit, which obtains the target's voice information for the auditory system. As shown in Part II, the computer is the main control unit of the auditory system; the relevant software for voice information processing runs on it, and it commands the underlying driver module to perform the corresponding action. Part III of figure 1 is the driving part, which adopts a DSP-based differential two-wheel drive motor control module, matched with two auxiliary wheels to improve the movement of the mobile robot.
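As a sketch of how such a differential drive could execute an in-place turn toward a reported sound azimuth, the standard inverse kinematics below map a commanded linear and angular velocity to the two wheel speeds. The wheel base and angular speed are assumed values, not taken from the paper.

```python
import math

WHEEL_BASE = 0.30   # distance between the two drive wheels, m (assumed)

def wheel_speeds(v, omega):
    """Differential-drive inverse kinematics: linear velocity v (m/s) and
    angular velocity omega (rad/s) -> (left, right) wheel speeds in m/s."""
    v_left = v - omega * WHEEL_BASE / 2.0
    v_right = v + omega * WHEEL_BASE / 2.0
    return v_left, v_right

def turn_in_place_duration(azimuth_rad, omega=math.radians(45)):
    """Time (s) to rotate by azimuth_rad at a fixed angular speed omega."""
    return abs(azimuth_rad) / omega

# An in-place rotation uses equal and opposite wheel speeds.
vl, vr = wheel_speeds(0.0, math.radians(45))
```

Setting v = 0 makes the wheels counter-rotate, so the robot pivots about the midpoint of its axle; this is why the two auxiliary wheels only stabilize the platform and do not constrain the turn.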
Design and Implementation of Auditory System for Mobile Robot Based on Kinect Sensor
Shuopeng Wang, Peng Yang, and Hao Sun
2016 12th World Congress on Intelligent Control and Automation (WCICA)
June 12-15, 2016, Guilin, China
978-1-4673-8414-8/16/$31.00 ©2016 IEEE