Speech Recognition Driven by a Stochastic Model of Automatically Extracted Phonemes

Speech Recognition Using Stochastic Phonemic Segment Model Based on Phoneme Segmentation

Chieko Furuichi, Katsura Aizawa, and Kazuhiko Inoue

Faculty of Engineering, Toin University of Yokohama, 1614 Kurogane, Midori, Yokohama, Japan 225-8502

SUMMARY

This paper discusses speech recognition based on a new statistical phoneme segment model which is trained on phoneme parameters derived from automatically extracted phoneme segments. The proposed system operates as follows. In preprocessing before recognition, the phoneme boundaries are detected by segmentation. The phonemes are discriminated using a stochastic phoneme segment model, and a phoneme segment lattice with scores is constructed. Next, speech recognition is performed by matching symbol sequences to dictionary items. The segmentation system that is employed can infer phoneme boundaries with high accuracy. This helps to eliminate unnecessary parameters, leaving the feature parameters which are effective in separating phonemes. In other words, the phoneme recognition problem in continuous speech can be reduced to a discrimination problem, and thus a speaker-independent model can be constructed from a relatively small amount of training data. The stochastic phoneme segment model is trained with training samples extracted from a phoneme-balanced word set of 4920 words uttered by 10 speakers. In a recognition experiment with 6709 words uttered by 63 nontraining speakers, a recognition rate of 92.6% was obtained as the average for all speakers, using

Syst Comp Jpn, 31(10): 89-98, 2000

Key words: Segment model; mixed distribution; phoneme segmentation; speech recognition.
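The final step described in the summary, matching a recognized symbol sequence to dictionary items, can be illustrated with a minimal sketch. This excerpt does not say which matching procedure the authors use, so plain edit distance is assumed here purely for illustration; the function names, the toy dictionary, and the phoneme labels are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences (lists of symbols)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(recognized, dictionary):
    """Return the dictionary word whose phoneme sequence is closest."""
    return min(dictionary,
               key=lambda word: edit_distance(recognized, dictionary[word]))


# Hypothetical toy dictionary: word -> phoneme sequence.
DICTIONARY = {
    "sakura": ["s", "a", "k", "u", "r", "a"],
    "sakana": ["s", "a", "k", "a", "n", "a"],
    "kuruma": ["k", "u", "r", "u", "m", "a"],
}

# A recognized sequence with one phoneme error ("d" instead of "r").
recognized = ["s", "a", "k", "u", "d", "a"]
word = best_match(recognized, DICTIONARY)
```

A lattice with scores, as described in the summary, would carry several scored candidates per segment rather than a single symbol; this sketch only covers the single-best case.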

1. Introduction

In continuous speech recognition systems, it is desirable to improve the accuracy of the acoustic model in order to improve the recognition rate for speech units such as phonemes and syllables. In recent years, many studies of segment models have attempted to include the temporal changes of the speech feature parameters in order to improve the accuracy of the acoustic model [1-4]. When a segment model is applied to recognition, the dimension of the parameters is usually increased. If the amount of training data is insufficient, the estimation accuracy of the model may be degraded, or a large amount of computation may be needed for recognition. Approaches to dealing with this problem have included compression of the parameter dimension by K-L expansion [5], and use of the output from a neural network into which several consecutive frames are simultaneously input [6].
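As a rough illustration of compressing the parameter dimension by K-L expansion, the sketch below projects feature vectors onto the leading eigenvector of their covariance matrix. It is not the procedure of reference [5], which this excerpt does not describe; power iteration is assumed here only to keep the example dependency-free, it recovers a single axis, and the data and function names are hypothetical.

```python
import math


def covariance(rows):
    """Return the mean vector and sample covariance matrix of the rows."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[d] for r in rows) / n for d in range(dim)]
    centered = [[r[d] - mean[d] for d in range(dim)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mean, cov


def leading_axis(cov, iters=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim  # arbitrary unit-length start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(row, mean, axis):
    """Coordinate of a mean-centered row along the given axis."""
    return sum((x - m) * a for x, m, a in zip(row, mean, axis))


# Hypothetical 2-D feature vectors that vary mostly along one direction.
rows = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9]]
mean, cov = covariance(rows)
axis = leading_axis(cov)
compressed = [project(r, mean, axis) for r in rows]  # one value per vector
```

Keeping more than one axis would require deflation or a full eigendecomposition, which a numerical library would normally provide.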

In the recognition of continuous speech by the segment model, there can be two approaches. One is to perform recognition without applying preliminary segmentation. The other is to detect the boundaries between phonemes or syllables by segmentation, and then to perform recognition using the segment model. The former method has been used more often, since segmentation is very difficult and a system accurate enough to be used for preprocessing before recognition is difficult to create.
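The second approach, segmentation followed by per-segment recognition, might look like the sketch below once boundaries are available. It assumes boundaries are given as frame indices and scores each segment with a single diagonal-covariance Gaussian per phoneme over the segment's mean feature vector. That is a simplification chosen for brevity, not the paper's stochastic phoneme segment model (the key words suggest a mixed distribution), and every model value and name here is hypothetical.

```python
import math


def segment_mean(frames):
    """Average the feature vectors of one segment into a single vector."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def log_gaussian(x, mean, var):
    """Log-density of x under a diagonal-covariance Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def classify_segments(frames, boundaries, models):
    """Score every phoneme model on every segment; return (best, scores)."""
    results = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        feat = segment_mean(frames[start:end])
        scores = {p: log_gaussian(feat, m, v) for p, (m, v) in models.items()}
        results.append((max(scores, key=scores.get), scores))
    return results


# Hypothetical phoneme models: label -> (mean vector, variance vector).
MODELS = {
    "a": ([1.0, 0.0], [0.5, 0.5]),
    "i": ([-1.0, 1.0], [0.5, 0.5]),
}

# Five toy frames; boundaries [0, 3, 5] mark segments [0, 3) and [3, 5).
frames = [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0], [-1.2, 0.8], [-0.8, 1.1]]
boundaries = [0, 3, 5]
labels = [best for best, _ in classify_segments(frames, boundaries, MODELS)]
```

Because the boundaries are fixed in advance, each segment is scored once against each phoneme model, with no search over alternative segmentations.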

If the boundaries between phonemes or syllables can be estimated with high accuracy by the latter method, however, the problem of recognizing phonemes or syllables in continuous speech can be reduced to a discrimination problem, unnecessary searching can be minimized, and the

Systems and Computers in Japan, Vol. 31, No. 10, 2000

Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J82-D-II, No. 7, July 1999, pp. 1111-1119
