Evaluation of the Relation between Emotional Concepts and
Emotional Parameters in Speech
Tsuyoshi Moriyama, Hideo Saito, and Shinji Ozawa
Department of Information and Computer Science, Keio University, Kanagawa, 223-8522 Japan
SUMMARY
A model is proposed that relates the physical changes
present in emotion-containing speech to the emotional content
perceived in that speech. By using statistical
bases extracted from the physical parameters of the speech
and from the associated emotion words, rather than the
parameters and words themselves, the model makes it possible
to relate physical changes and emotional content
independently of the choice of variables for consideration.
In this model, moreover, emotions that are shared among
listeners, or emotion stereotypes, are used as the standard
of judgment instead of the emotion intended by the speaker,
so that the emotions can be assumed to be observable
and reproducible. In this study, the physical
parameters of several speech samples are first calculated and the
emotional content of the same speech samples is obtained
in a psychological experiment; these data sets are then
processed by statistical methods to obtain orthogonal bases.
Next, these bases are related linearly by multiple regression
analysis. The result is relation information that allows
conversion between the physical parameters and the emotional
content of the speech. © 2001
Scripta Technica, Syst Comp Jpn, 32(3): 56-64, 2001
Key words: Emotional stereotypes; emotional
content; component analysis; emotional speech; emotion
words; prosodic parameters.
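
The two-stage procedure outlined in the summary (orthogonal bases first, then a linear relation between them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: principal component analysis via SVD stands in for the unspecified "statistical methods", the data are synthetic random numbers rather than measured speech or listener ratings, and the names `orthogonal_basis`, `fit_linear_relation`, and `predict` are hypothetical.

```python
import numpy as np


def orthogonal_basis(data, n_components):
    """Center the columns and project onto the leading principal directions.

    PCA via SVD is an assumed stand-in for the paper's statistical method.
    Returns (scores, components, mean); rows of `components` are orthonormal.
    """
    mean = data.mean(axis=0)
    centered = data - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    return scores, components, mean


def fit_linear_relation(source_scores, target_scores):
    """Multiple regression (least squares with intercept) between score sets."""
    ones = np.ones((source_scores.shape[0], 1))
    design = np.hstack([source_scores, ones])
    coeffs, *_ = np.linalg.lstsq(design, target_scores, rcond=None)
    return coeffs


def predict(source_scores, coeffs):
    """Apply the fitted regression to new source scores."""
    ones = np.ones((source_scores.shape[0], 1))
    return np.hstack([source_scores, ones]) @ coeffs


# Synthetic stand-ins: 40 speech samples, 6 hypothetical physical parameters,
# 8 hypothetical emotion-word ratings that depend linearly on them plus noise.
rng = np.random.default_rng(0)
n_samples = 40
physical = rng.normal(size=(n_samples, 6))
mixing = rng.normal(size=(6, 8))
ratings = physical @ mixing + 0.1 * rng.normal(size=(n_samples, 8))

# Stage 1: orthogonal bases for each data set.
phys_scores, phys_basis, phys_mean = orthogonal_basis(physical, 3)
emo_scores, emo_basis, emo_mean = orthogonal_basis(ratings, 3)

# Stage 2: linear relation between the two sets of basis scores.
coeffs = fit_linear_relation(phys_scores, emo_scores)

# Conversion from physical parameters to estimated emotion-word ratings.
predicted_emo_scores = predict(phys_scores, coeffs)
predicted_ratings = predicted_emo_scores @ emo_basis + emo_mean
```

Because the regression is fitted between basis scores rather than between raw variables, swapping in a different set of physical parameters or emotion words changes only the first stage, which is the independence from variable choice that the summary describes.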
1. Introduction
Recently, the remarkable progress of the information
communication infrastructure and the various developments
in multimedia technology have made it possible to
provide many levels and kinds of information communication
services to consumers. The objects in such services
not only perform special or routine tasks, but also extend
their role to that of partners for psychological regeneration
in communication. In order to provide a kind of comfort
through human-machine communication, it appears that
the objects must be able to understand human factors such
as emotion and also to imitate them. Many investigations
concerned with the emotional information carried by
speech have been performed with such human-friendly
technologies in mind [3-5, 8, 9]. Almost all of them have
attempted to formulate the relation between physical parameters
and emotional content and to synthesize emotional
expression that can be superimposed on neutral speech, but
their results have been inadequate. Because the psychological
quantity of emotion depends on such factors as the
cultural background of the participants in communication,
situational context, vocal personality, the meanings of
words, and so on, certain limitations or conditions are
required in order to deal with it by an engineering approach,
and the differences among the viewpoints of the studies are
reflected in such constraints. In past studies, constraints of
this kind have rarely been investigated, leading to difficulty
in clarifying the relation between the results of different
studies.
Systems and Computers in Japan, Vol. 32, No. 3, 2001
Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J82-D-II, No. 4, April 1999, pp. 703-711