Mixture-Model Language Modeling for Recognizing Retold Stories in Language Learning: Up to 61.6% Perplexity Reduction

121 views, updated 2024-08-29, 997 KB PDF

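The paper "Adapted Language Modeling for Recognition of Retelling Story in Language Learning" studies how to improve language model performance for the story-retelling task in language learning when large amounts of in-domain training data are not available. N-gram language models normally depend on large quantities of data that match the task in both topic and style, and collecting that many speech transcriptions of retold stories is impractical. The authors therefore propose language modeling with mixture models.

The method first builds three language models separately: a topic-specific model, a spoken-style model, and a document-style model, each capturing a different aspect of how students retell a story. These component models are then interpolated, that is, combined through a weighted average of their probabilities, so that the mixture draws on the strengths of each component. The paper also interpolates a class-based language model with the N-gram models to make the estimates more robust when data are scarce.

In the reported experiments, the best-performing model achieves up to a 61.6% reduction in perplexity and a 20.7% reduction in word error rate (WER). The main contribution is a strategy for adapting language models to a language-learning task under limited resources, which is of practical value for natural language processing and educational technology. The approach may find use in personalized teaching, automatic assessment, and adaptive language-training systems that help learners improve their speaking and retelling skills.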
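The interpolation step described above can be sketched in a few lines. This is a minimal illustration, assuming linear interpolation of unigram models with fixed weights; the toy probabilities, the weights, and the function names `interpolate` and `perplexity` are hypothetical and are not taken from the paper, which works with N-gram models and does not state its weights in this excerpt.

```python
import math

def interpolate(models, weights):
    """Return a function giving the linearly interpolated probability of a word."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")

    def prob(word):
        # Weighted average of the component model probabilities.
        return sum(w * m.get(word, 0.0) for m, w in zip(models, weights))

    return prob

def perplexity(prob, words):
    """Perplexity of a word sequence under a unigram probability function."""
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))

# Toy unigram distributions (hypothetical numbers, not from the paper).
topic = {"fox": 0.5, "crow": 0.4, "um": 0.0, "the": 0.1}
spoken = {"fox": 0.1, "crow": 0.1, "um": 0.5, "the": 0.3}
document = {"fox": 0.2, "crow": 0.2, "um": 0.0, "the": 0.6}

# Assumed weights; in practice they would be tuned on held-out data.
mixed = interpolate([topic, spoken, document], [0.5, 0.3, 0.2])

retelling = ["the", "fox", "um", "the", "crow"]
ppl = perplexity(mixed, retelling)
print(f"perplexity of the mixture on the toy retelling: {ppl:.2f}")
```

In this toy setup the topic model alone gives the filler "um" zero probability, so it cannot score the retelling at all; the mixture can, because the spoken-style component covers the filler.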

Adapted Language Modeling for Recognition of Retelling Story in Language Learning

Meng Chen, Yang Song, Lan Wang

Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences/The Chinese University of Hong Kong

{chenmeng, yang.song, lan.wang}@siat.ac.cn

Abstract

N-gram language modeling typically requires large quantities of in-domain training data, i.e., data that matches the task in both topic and style. For the task of retelling stories, obtaining large volumes of speech transcriptions is often unrealistic. In this paper, we propose a novel method of language modeling using mixture models with very limited text data in the task of retelling stories. We modeled topic-specific, spoken-style, and document-style language models separately and interpolated them. We also interpolated the class-based language model with the N-gram models. Experimental results show that up to 61.6% reduction of perplexity and 20.7% reduction of word error rate (WER) have been obtained by our best performing model.
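For readers unfamiliar with the WER metric quoted in the abstract, the sketch below computes it as the word-level edit distance divided by the reference length. The helper `relative_reduction` assumes the reported reductions are relative to a baseline, which this excerpt does not state; both function names and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline, improved):
    """Relative reduction of a metric (assumed interpretation of 'reduction')."""
    return (baseline - improved) / baseline

wer = word_error_rate("the fox ate the cheese", "the fox eat cheese")
print(f"WER: {wer:.2f}")
```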

1. Introduction

With the development of computer technology, Computer Assisted Language Learning (CALL) systems have offered great advantages over traditional language learning methods. Retelling stories has been used to evaluate a language learner's oral proficiency. Automatic scoring based on Automatic Speech Recognition (ASR) to evaluate speaking ability in the task of retelling stories has been studied recently. In the retelling task, students listen to a monologue of a story (200~300 words) spoken by a native speaker, and then retell the story in their own words. The students' recordings are non-native spontaneous speech with a specific spoken style, which not only differs from the original story but also contains lexical and syntactic errors.

For spontaneous speech recognition, researchers have made numerous efforts to increase ASR accuracy by employing a variety of improved language modeling techniques. In [1], the authors constructed a language model for spontaneous speech by combining written text from textbooks with transcripts of conversational telephone speech from the Switchboard and Fisher corpora. Another work [2] presented a method of generating simulated spoken-style text by randomly inserting fillers into written-style text. However, this approach handles only fillers and does not consider features such as repetition and self-repair. G. Moore and S. Young [3] used class-based language models for robust estimation of N-gram probabilities with limited or unmatched data. Akita and Kawahara [4] proposed another approach using a probabilistic transformation model trained from a parallel aligned corpus of faithful transcripts and their written-style texts. However, such an aligned corpus is quite difficult to obtain. All of the above efforts focused mainly on the spontaneous speech of native speakers; few researchers have explored language modeling for non-native spontaneous speech recognition.
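To make the idea of class-based models concrete, the sketch below uses a common class-based bigram formulation, P(w | w_prev) ≈ P(c | c_prev) · P(w | c), and interpolates it with a word bigram. This is an illustration under assumptions: the excerpt does not say which formulation [3] or the present paper uses, and the word classes, probabilities, weight `lam`, and function names are all hypothetical.

```python
def class_bigram_prob(word, prev_word, word_class, emit, trans):
    """Class-based bigram: P(c | c_prev) * P(w | c), with 0.0 for unseen events."""
    c_prev = word_class[prev_word]
    c = word_class[word]
    return trans.get((c_prev, c), 0.0) * emit.get((word, c), 0.0)

def interpolated_prob(word, prev_word, word_bigram, lam, word_class, emit, trans):
    """Linear interpolation of a word bigram with the class-based bigram."""
    p_word = word_bigram.get((prev_word, word), 0.0)
    p_class = class_bigram_prob(word, prev_word, word_class, emit, trans)
    return lam * p_word + (1.0 - lam) * p_class

# Toy model (hypothetical classes and numbers, not from the paper).
word_class = {"the": "DET", "fox": "NOUN", "crow": "NOUN", "um": "FILLER"}
emit = {("the", "DET"): 1.0, ("fox", "NOUN"): 0.5,
        ("crow", "NOUN"): 0.5, ("um", "FILLER"): 1.0}
trans = {("DET", "NOUN"): 0.7, ("DET", "FILLER"): 0.3}

# The word bigram has seen "the fox" but never "the crow".
word_bigram = {("the", "fox"): 0.8}

p_seen = interpolated_prob("fox", "the", word_bigram, 0.6, word_class, emit, trans)
p_unseen = interpolated_prob("crow", "the", word_bigram, 0.6, word_class, emit, trans)
print(p_seen, p_unseen)
```

The unseen pair "the crow" still receives a nonzero probability because "crow" shares a class with "fox", which is the kind of robustness with limited data that motivates class-based models.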

Although these language model improvement techniques are undoubtedly helpful, they either need large amounts of closely matched data or can cover only limited features of spontaneous speech style. In the task of retelling stories, students are required to repeat the story based on what they heard, and they organize sentences in their own words when they cannot remember the exact words used by the native speaker. Therefore, the speech of students is non-native spontaneous speech with three specific features. Firstly, the speech is closely related to the original story in topic, but not restricted to the vocabulary of the original story. Secondly, there are many disfluencies, such as filled pauses, hesitations, repeated words, and self-repaired words. Thirdly, it contains various lexical and syntactic errors, since the speakers are non-native and their oral abilities are far from those of native speakers. Because of these specific features of retelling speech, transcripts of telephone conversations or newswire text are obviously not suitable.

In this paper we proposed an effective method to
