2 Related Work
As described in [5], there are three kinds of methods for named entity recognition:
dictionary-based methods, rule-based methods, and statistical machine learning
methods, which rely on different theories. NER can be solved by machine learning
methods such as Conditional Random Fields (CRF) [6,7], Support Vector Machines (SVM) [8],
Hidden Markov Models (HMM) [9], etc. In recent years these methods have commonly been
applied to NER in a supervised learning setting. In addition, semi-supervised methods
offer another route to this task when labeled data is difficult to obtain.
Recently, while probabilistic statistical models perform well in many fields,
deep neural networks, as a new wave in machine learning, have achieved strong
performance in many domains such as image classification [10], knowledge dis-
covery [11], and translation [12]. Collobert et al. [13] propose a unified neural
network architecture and learning algorithm for various NLP tasks and also
achieve good results on NER. Compared with the well-known Convolutional
Neural Network (CNN), which has achieved remarkable performance in the
image domain, the Recurrent Neural Network (RNN) can exploit recurrent feedback
over time and thus capture dependencies beyond the input window. Therefore, the RNN
architecture is more suitable for NER. Song et al. [14] build a simple and efficient
system for bio-NER based on an RNN. Chiu and Nichols [15] present a novel neural
network architecture that automatically detects word- and character-level features
using a hybrid bidirectional Long Short-Term Memory (LSTM) and CNN architecture.
On the other hand, as described in [16], a deep neural network is characterized
by a set of weight matrices, bias vectors, and a nonlinear activation function,
which gives it the ability to learn hierarchical nonlinear mappings. During
parameter training, however, the weight matrices and bias vectors are updated
by an error back-propagation algorithm, whereas the activation function is not.
The choice of activation function is therefore important for a neural network:
it can speed up model training [17] and enhance stability [18]. In this paper, we
adopt the RNN model and modify its activation function for the NER task.
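To make the role of the activation function concrete, consider a standard Elman-style
recurrence (a generic formulation given here only for illustration, not necessarily
identical to the model used later in this paper): h_t = f(W_{xh} x_t + W_{hh} h_{t-1} + b_h)
and y_t = g(W_{hy} h_t + b_y), where x_t is the input at time t, h_t the hidden state,
and y_t the output. The weight matrices W and bias vectors b are learned by
back-propagation through time, while the hidden-layer activation f is fixed by design;
replacing f changes the nonlinear mapping without adding any parameters, which is the
lever exploited here.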
Another problem with RNNs is that they need plenty of training data. Hence, in this
paper we consider co-training, one of the useful solutions when training data is
scarce. Co-training, a semi-supervised learning method, was first proposed in 1998
and has also been used in NER. Tsendsuren et al. [19] present an Active Co-Training
(ACT) algorithm for biomedical named-entity recognition. Li et al. [20] propose a
semi-supervised approach to extract bilingual named entities and use a bilingual
co-training algorithm to improve the quality of named entity annotation. However,
studies that use an RNN for co-training remain rare in NER research [21], and most
of them concern the biomedical domain. In this paper, we aim to explore the
performance obtained by co-training an improved RNN with probabilistic statistical
models on the NER task; the general procedure is sketched below.
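As a rough illustration of the generic co-training idea only (the two models, the
confidence measure, and the selection size k below are illustrative placeholders,
not our actual configuration), two classifiers trained on a small labeled seed set
repeatedly label an unlabeled pool and move their most confident predictions into
the shared training set:

def co_train(model_a, model_b, X_seed, y_seed, pool, rounds=5, k=100):
    # model_a / model_b: any classifiers exposing fit(X, y) and predict_proba(X),
    # e.g. a neural tagger and a probabilistic statistical model.
    X, y = list(X_seed), list(y_seed)
    pool = list(pool)
    for _ in range(rounds):
        model_a.fit(X, y)
        model_b.fit(X, y)
        if not pool:
            break
        used = set()
        for model in (model_a, model_b):
            probs = model.predict_proba(pool)  # one probability vector per pool item
            ranked = sorted(range(len(pool)),
                            key=lambda i: max(probs[i]), reverse=True)
            # keep the k examples this model labels most confidently
            for i in ranked[:k]:
                if i in used:
                    continue
                used.add(i)
                X.append(pool[i])
                y.append(max(range(len(probs[i])), key=lambda c: probs[i][c]))
        pool = [x for i, x in enumerate(pool) if i not in used]
    return model_a, model_b

In practice each model may be given its own view of the features, and the newly
labeled examples may be added only to the other model's training set; the shared-set
variant above is simply the most compact form of the loop.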