Learning to Attend via Word-Aspect Associative Fusion
for Aspect-based Sentiment Analysis
Yi Tay∗1, Luu Anh Tuan∗2 and Siu Cheung Hui3
1,3 Nanyang Technological University, School of Computer Science and Engineering, Singapore
2 Institute for Infocomm Research, Singapore
∗ Denotes equal contribution
Abstract
Aspect-based sentiment analysis (ABSA) tries to predict the
polarity of a given document with respect to a given aspect
entity. While neural network architectures have been suc-
cessful in predicting the overall polarity of sentences, aspect-
specific sentiment analysis remains an open problem.
In this paper, we propose a novel method for integrating
aspect information into the neural model by explicitly
modeling word-aspect relationships. Our novel model,
Aspect Fusion LSTM (AF-LSTM), learns to attend based on
associative relationships between sentence words and the
aspect, which allows our model to adaptively focus on the
correct words given an aspect term. This ameliorates the
flaws of other
state-of-the-art models that utilize naive concatenations to
model word-aspect similarity. Instead, our model adopts cir-
cular convolution and circular correlation to model the simi-
larity between aspect and words and elegantly incorporates
this within a differentiable neural attention framework. Fi-
nally, our model is end-to-end differentiable and highly re-
lated to convolution-correlation (holographic-like) memories.
Our proposed neural model achieves state-of-the-art perfor-
mance on benchmark datasets, outperforming ATAE-LSTM
by 4%–5% on average across multiple datasets.
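For reference, the two associative operators named above have standard definitions. The sketch below states them for $d$-dimensional vectors $a$ (an aspect embedding) and $w$ (a word embedding); the symbols are chosen here for illustration and are not necessarily the exact formulation developed later in the paper.
\[
[a \ast w]_k = \sum_{i=0}^{d-1} a_i \, w_{(k-i) \bmod d} \quad \text{(circular convolution)}
\]
\[
[a \star w]_k = \sum_{i=0}^{d-1} a_i \, w_{(k+i) \bmod d} \quad \text{(circular correlation)}
\]
\[
a \star w = \mathcal{F}^{-1}\!\left(\overline{\mathcal{F}(a)} \odot \mathcal{F}(w)\right)
\]
where $\mathcal{F}$ denotes the discrete Fourier transform, $\odot$ is the element-wise product and the overline denotes complex conjugation; the last identity allows circular correlation to be computed in $O(d \log d)$ time.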
Introduction
Sentiment analysis lies at the heart of many business and
social applications, which explains its wide popularity in
NLP research. Aspect-based sentiment analysis (ABSA)
goes deeper by trying to predict polarity with respect to a
specific aspect term. For example, consider the following re-
view, ‘I love the user interface but this app is practically use-
less!’. Clearly, we observe that there are two aspects (user
interface and functionality) with completely opposite polar-
ities. As such, techniques that can incorporate aspect
information when making predictions are not only highly
desirable but also
significantly more realistic compared to coarse-grained sen-
timent analysis. Recently, end-to-end neural networks (or
deep learning) (Wang et al. 2016; Li, Guo, and Mei 2017)
such as long short-term memory (LSTM) networks (Hochreiter
and Schmidhuber 1997) and memory networks (Sukhbaatar
et al. 2015) have demonstrated promising performance on
ABSA tasks without requiring any laborious feature engi-
neering.
The task of ABSA introduces the challenging problem of
incorporating aspect information into neural architectures.
As such, models that elegantly fuse aspect information with
sentence modeling are highly desirable. Recently, there have been a
myriad of models proposed for this purpose. For example,
ATAE-LSTM (Wang et al. 2016) is a recently introduced
attention-based model that learns to attend to different parts
of the sentence given the aspect information. ATAE-LSTM
incorporates aspect information by simply concatenating the
aspect embedding with the context word embeddings, both
at the attention layer and at the sentence modeling layer
(the inputs to the LSTM); an illustrative sketch of this
scheme is given after the list below. Consequently, the
ATAE-LSTM model suffers from the following drawbacks:
• Instead of allowing the attention layer to focus on learning
the relative importance of context words, it is given the extra
burden of modeling the relationship between the aspect and
context words.
• The LSTM parameters are burdened not only with
modeling sequential information but also with learning
relationships between the aspect and context words.
Moreover, the LSTM layer in ATAE-LSTM is trained on a
sequence that is dominated by the aspect embedding. As
such, the model becomes significantly harder to train.
• Naive concatenation doubles the input dimension of the
LSTM layer in ATAE-LSTM, which incurs additional
parameter costs. This has implications for memory
footprint, computational complexity and the risk of
overfitting.
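To make the last point concrete, below is a minimal, hypothetical sketch (written in PyTorch with made-up dimensions; it is not the authors' implementation) of the naive concatenation scheme at the LSTM input and the extra parameters it incurs:

```python
# Illustrative sketch only (hypothetical dimensions, not the authors' code):
# naive word-aspect concatenation at the LSTM input, as in the ATAE-LSTM
# input scheme described above, and the parameter cost it incurs.
import torch
import torch.nn as nn

d_word, d_aspect, d_hidden, seq_len = 300, 300, 300, 20

words  = torch.randn(1, seq_len, d_word)   # context word embeddings
aspect = torch.randn(1, 1, d_aspect)       # one aspect embedding

# Append the same aspect vector to every word embedding,
# doubling the size of each LSTM input step.
fused = torch.cat([words, aspect.expand(-1, seq_len, -1)], dim=-1)

plain_lstm  = nn.LSTM(d_word, d_hidden, batch_first=True)
concat_lstm = nn.LSTM(d_word + d_aspect, d_hidden, batch_first=True)
outputs, _  = concat_lstm(fused)           # sequence modeling on the fused input

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(plain_lstm))    # 4 * (d_word * d_hidden + d_hidden**2 + 2 * d_hidden)
print(count(concat_lstm))   # adds 4 * d_aspect * d_hidden input-to-hidden weights
```

With the 300-dimensional embeddings and hidden states assumed here, the concatenated variant carries roughly 50% more LSTM parameters (1,082,400 vs. 722,400), illustrating the memory-footprint and overfitting concerns above.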
In summary, the important question here is whether the
naive concatenation of aspect and words at both the LSTM
layer and attention layer is necessary or even desirable. In
fact, our early empirical experiments showed that the ATAE-
LSTM does not always outperform the baseline LSTM
model. We believe that this is caused by the word-aspect
concatenation making the model difficult to train. As such,
this paper aims to tackle the weaknesses of ATAE-LSTM
while maintaining the advantages of aspect-aware atten-
tions. Our model cleverly separates the responsibilities of
layers by incorporating a dedicated association layer for