A Deep Neural Network Sentence Level Classification Method with
Context Information
Xingyi Song and Johann Petrak
Department of Computer Science
University of Sheffield
Sheffield, UK
{x.song, johann.petrak}@sheffield.ac.uk
Angus Roberts
NIHR Biomedical Research Centre
Institute of Psychiatry, Psychology and Neuroscience
King's College London
London, UK
angus.roberts@kcl.ac.uk
Abstract
In the sentence classification task, context formed from sentences adjacent to the sentence being classified can provide important information for classification. This context is, however, often ignored. Where methods do make use of context, only small amounts are considered, making it difficult to scale. We present a new method for sentence classification, Context-LSTM-CNN, that makes use of potentially large contexts. The method also utilizes long-range dependencies within the sentence being classified, using an LSTM, and short-span features, using a stacked CNN. Our experiments demonstrate that this approach consistently improves over previous methods on two different datasets.
1 Introduction
Artificial neural networks (ANN), and especially deep neural networks (DNN), give state-of-the-art results for sentence classification tasks. Usually, sentences are treated as separate instances for the task. However, in many situations the sentence that is the focus of classification appears in a context that can provide additional information. For example, in the sentences below from the IEMOCAP dataset, it is difficult to classify M02 as showing excitement without the prior context:
• M01: I got it. I got accepted to U.S.C..
• F01: Oh, for real?
• M02: Yes! I just found out today. I just got the letter.
Our work is motivated by sentence classification in the text of medical records, in which complex judgements may be made across several sentences, each adding weight and nuance to a point. We believe, however, that the technique is more widely applicable. In order to test generalisability and to allow reproducibility, we therefore present an evaluation of the method with publicly available, non-medical corpora.
Previous work on using context for sentence classification used LSTM and CNN network layers to encode the surrounding context, giving an improvement in classification accuracy (Lee and Dernoncourt, 2016). However, the use of CNN and LSTM layers imposes a significant computational cost when training the network, especially if the size of the context is large. For this reason, the approach presented by Lee and Dernoncourt (2016) is explicitly intended for sequential, short-text classification.
In many cases, however, the context available is of significant size. We therefore introduce a new method, Context-LSTM-CNN¹, which is based on the computationally efficient Fixed-size Ordinally-Forgetting Encoding (FOFE) method (Zhang et al., 2015), and an architecture that combines an LSTM and CNN for the focus sentence. The method consistently improves over results obtained from either LSTM alone, CNN alone, or these two combined, with little increase in training time.
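FOFE compresses a variable-length token sequence into a fixed-size vector with a single parameter-free recursion, z_t = α·z_{t−1} + e_t with z_0 = 0, so encoding a context of any length costs one linear pass. The following is a minimal NumPy sketch of that recursion; applying it to word embeddings rather than the one-hot vectors used by Zhang et al. (2015), and the choice α = 0.9, are assumptions made here for illustration only.

```python
import numpy as np

def fofe_encode(embeddings, alpha=0.9):
    """Fixed-size Ordinally-Forgetting Encoding (Zhang et al., 2015):
    z_0 = 0;  z_t = alpha * z_{t-1} + e_t.
    Tokens further in the past are down-weighted by alpha^distance, so the
    result is one fixed-size vector regardless of sequence length."""
    z = np.zeros(embeddings.shape[1])
    for e in embeddings:      # one pass over the sequence, no trainable weights
        z = alpha * z + e
    return z

# Toy usage: a 5-token context with 4-dimensional word vectors.
context = np.random.randn(5, 4)
print(fofe_encode(context))   # a single 4-dimensional context encoding
```

Because the recursion has no trainable parameters, enlarging the context adds almost nothing to training cost, which is what allows contexts of arbitrary size.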
This paper makes three contributions: 1) a demonstration of the importance of context in some sentence classification tasks; 2) an adaptation of existing datasets for such sentence classification tasks, in order to support reproducibility of evaluations; 3) a neural architecture for sentence classification that outperforms previous methods, and can include context of arbitrary size without incurring a large computational cost.
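To make the architecture concrete, below is a minimal PyTorch-style sketch of how such a model could be assembled from the components named above: an LSTM and a stacked CNN over the focus sentence, with FOFE-encoded left and right contexts appended before the output layer. The class name, layer sizes, max-pooling, and the simple concatenation of features are assumptions made for illustration; the actual configuration is described later in the paper and may differ.

```python
import torch
import torch.nn as nn

class ContextLSTMCNN(nn.Module):
    """Illustrative sketch only: the focus sentence is encoded by an LSTM
    (long-range dependencies) and a stacked CNN (short-span features);
    FOFE vectors for the left and right contexts are concatenated with
    both encodings before classification."""

    def __init__(self, emb_dim=100, hidden=128, n_filters=100, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.convs = nn.Sequential(      # stacked CNN over the focus sentence
            nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.out = nn.Linear(hidden + n_filters + 2 * emb_dim, n_classes)

    def forward(self, sent_emb, left_fofe, right_fofe):
        # sent_emb: (batch, seq_len, emb_dim); *_fofe: (batch, emb_dim)
        _, (h, _) = self.lstm(sent_emb)                            # long-range features
        cnn = self.convs(sent_emb.transpose(1, 2)).max(2).values   # max-pool over time
        return self.out(torch.cat([h[-1], cnn, left_fofe, right_fofe], dim=1))

# Toy usage with random tensors standing in for real embeddings.
model = ContextLSTMCNN()
logits = model(torch.randn(2, 20, 100), torch.randn(2, 100), torch.randn(2, 100))
print(logits.shape)  # torch.Size([2, 4])
```

Note that only the focus sentence passes through the LSTM and CNN; the contexts enter as precomputed fixed-size FOFE vectors, which is why context size barely affects training time.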
2 Related work
Since their introduction (Collobert et al., 2011), CNNs with word embedding language models have become common for text classification tasks (Kim, 2014; Conneau et al., 2017). One limitation of the original CNN approach is the loss
¹ The code is publicly available at https://github.com/deansong/contextLSTMCNN