Construction of an Adverbial-Verb Collocation Database Based on a Large-Scale Corpus


"该资源是一篇关于基于大规模语料库构建副动词搭配数据库的研究论文。作者通过前期研究和语言规则建立副动词搭配的知识体系，并设计实现了大规模语料库中的副动词搭配知识获取模型。文章的主要目标是通过形式化方法获取高质量的副动词搭配，为自然语言处理和基础语言学及应用研究提供数据支持。关键词包括：大规模语料库、知识提取、副动词搭配。" 本文主要探讨了如何利用大规模语料库构建高质量的副动词搭配数据库，这对理解和处理汉语中的复杂语言现象具有重要意义。由于汉语缺乏形态变化，短语通常由多个词汇组成，内部层次关系复杂，使得副动词搭配的学习和应用变得尤为困难。因此，建立一个系统的、基于大规模数据的副动词搭配库显得至关重要。首先，研究人员依据先前的研究成果和语言学规则建立了一个副动词搭配的知识系统。这一系统包含了丰富的语义和句法信息，为后续的数据挖掘和分析奠定了基础。知识系统的设计考虑了汉语的特点，如词序、语境影响以及搭配的频度等因素。接下来，他们设计并实现了一种基于大规模语料库的副动词搭配知识获取模型。这种模型可能采用了诸如统计分析、机器学习或者深度学习的方法，通过对语料库中的大量文本进行自动分析，识别出频繁出现且符合语言习惯的副动词组合，从而提取出高质量的搭配信息。在模型的评估和分析阶段，作者们对提取的结果进行了验证，可能包括准确性、覆盖率和稳定性等方面的评估。这一步骤确保了所构建的数据库能够准确反映实际语言中的副动词搭配模式，避免了错误信息的引入。最后，这个副动词搭配数据库的建立，不仅对于自然语言处理（NLP）任务如机器翻译、语义理解等提供了关键的数据支持，还为语言学的基础研究和应用研究，比如语料库语言学、词汇语法学等领域提供了宝贵的资源。通过这样的数据库，研究者可以更好地理解和探索汉语中副动词搭配的规律，进一步推动汉语处理技术的发展。这篇论文通过结合语言学理论和大数据分析，提出了一种有效构建副动词搭配数据库的方法，为提高汉语处理的智能化水平提供了重要的理论和技术支撑。

this paper carries out a knowledge extraction project for adverbial-verb collocations based on part of speech, word length, pause, rhythm, and language rules. The approach is innovative in method. It provides large-scale real data for further research based on statistics and machine learning, and collocation examples for language ontology, teaching, and applied research.

3 Construction of the Knowledge System of Adverbial-verb Collocations

There have been some previous studies on adverbial-verb structures. For example, Zhu classified adverbials into two categories: adverbial modifiers and adjectival modifiers. Adverbial modifiers include adverbs converted from verbs and nouns with the adverbial suffix 的 (de, auxiliary). Adjectival modifiers include state adjectives, and some compound words with state adjective suffixes can also be used as adverbials. Zhu also holds that some substantives, such as numerals and quantifiers, have a predicative nature and can therefore be modified by adverbials [31]. Xing holds that the adverbial is usually defined as the modifier of verbs and adjectives, and that adverbials can also be realized by nouns or noun phrases. Classified semantically, adverbials mainly include the state adverbial, potential adverbial, degree adverbial, negative adverbial, condition adverbial, and object adverbial [32].

This paper establishes an adverbial-verb collocation system based on previous studies. The extracted adverbial collocations cover only simple, non-recursive cases in which a modifier modifies a predicate headword, such as 紧紧地抓住 (jin3jin3 de zhua1zhu4, hold tightly), 不断地提高 (bu2duan4 de ti2gao1, improve constantly), and 很喜欢 (hen3xi3huan1, like very much). Because of the particularities of extracting data by computer, we do not consider complex adverbial-verb structures for the time being. We also exclude nouns that modify predicate components, such as 群众的支持 (qun2zhong4 de zhi1chi2, mass support), and modifiers that modify nominals, such as 很淑女 (hen3shu1nv3, very ladylike) and 才周二 (cai2zhou1er4, only Tuesday).
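The scope restriction above can be sketched as a flat pattern match over part-of-speech-tagged tokens. This is a minimal illustration and not the paper's extraction model: the single-letter tags (d, a, v, n, t, u, r), the function name `extract_simple_collocations`, and the treatment of 地 as an optional marker between modifier and headword are all assumptions made for the example.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# Hypothetical tagset, assumed for illustration only:
# d = adverb, a = adjective, v = verb, n = noun, t = time word,
# u = auxiliary particle, r = pronoun.
MODIFIER_TAGS = {"d", "a"}        # candidate adverbials
PREDICATE_HEAD_TAGS = {"v", "a"}  # candidate predicate headwords


def extract_simple_collocations(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (modifier, headword) pairs for the flat pattern
    modifier (+ optional 地) + predicate headword."""
    pairs = []
    for i, (word, tag) in enumerate(tokens):
        if tag not in MODIFIER_TAGS:
            continue
        j = i + 1
        # Skip an optional marker 地 between the modifier and the headword.
        if j < len(tokens) and tokens[j][0] == "地":
            j += 1
        # Keep the pair only when the next token is a predicate headword,
        # so nominal heads such as 淑女 or 周二 are left out.
        if j < len(tokens) and tokens[j][1] in PREDICATE_HEAD_TAGS:
            pairs.append((word, tokens[j][0]))
    return pairs


if __name__ == "__main__":
    sentence = [("他", "r"), ("紧紧", "d"), ("地", "u"), ("抓住", "v"), ("绳子", "n")]
    print(extract_simple_collocations(sentence))
```

The sketch only looks at adjacent tokens, so it does not detect or filter nested structures. A real system would also need the word-length, pause, and rhythm features that the paper mentions.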

The adverbial-verb collocations in this paper are mainly centered on the predicative headword. In form, they can be divided into three types: adverbial + verb, adverbial + adjective, and adverbial + predicate pronoun. Adverbial-verb structures with complex adverbials or complex headwords are excluded for the time being, as shown in the table below:

Table 1. Classification table of adverbial-verb collocations

Adverbial collocation classification    Adverbial classification
Adverbial + Verb                        Adverbial modifiers
                                        Adjective modifiers
                                        Verb modifiers
                                        Nominal modifiers
                                        Adverbial modifiers of predicative pronouns
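The three surface forms named above (adverbial + verb, adverbial + adjective, adverbial + predicate pronoun) can be told apart by the headword alone. The following sketch assumes the same hypothetical single-letter tags (v, a, r) and a small, incomplete list of predicative pronouns chosen for illustration. The names `classify_form` and `count_forms` are not from the paper.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Illustrative, incomplete list of predicative pronouns (an assumption).
PREDICATIVE_PRONOUNS = {"这样", "那样", "怎样", "怎么样"}


def classify_form(head: str, head_tag: str) -> Optional[str]:
    """Map a headword to one of the three collocation forms,
    or None when the headword falls outside the scheme."""
    if head_tag == "v":
        return "adverbial + verb"
    if head_tag == "a":
        return "adverbial + adjective"
    if head_tag == "r" and head in PREDICATIVE_PRONOUNS:
        return "adverbial + predicate pronoun"
    return None


def count_forms(collocations: Iterable[Tuple[str, str, str]]) -> Counter:
    """Count collocations per form; each item is (modifier, head, head_tag)."""
    counts: Counter = Counter()
    for _modifier, head, head_tag in collocations:
        form = classify_form(head, head_tag)
        if form is not None:
            counts[form] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("紧紧", "抓住", "v"),
        ("很", "喜欢", "v"),
        ("很", "高兴", "a"),
        ("也", "这样", "r"),
        ("才", "周二", "t"),  # nominal head, outside the three forms
    ]
    for form, n in count_forms(sample).items():
        print(form, n)
```

Keying the classification on the headword keeps the collocation form separate from the adverbial type listed in the second column of Table 1, which would need its own lookup.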
