A Context-aware Attention Network for Interactive Question Answering
Huayu Li¹∗, Martin Renqiang Min², Yong Ge³, Asim Kadav²
¹Department of Computer Science, UNC Charlotte
²Machine Learning Group, NEC Laboratories America
³Management Information Systems, University of Arizona
hli38@uncc.edu, {renqiang,asim}@nec-labs.com, yongge@email.arizona.edu
ABSTRACT
Neural network based sequence-to-sequence models in an encoder-decoder framework have been successfully applied to solve Question Answering (QA) problems, predicting answers from statements and questions. However, almost all previous models have failed to consider detailed context information and unknown states under which systems do not have enough information to answer given questions. These scenarios with incomplete or ambiguous information are very common in the setting of Interactive Question Answering (IQA). To address this challenge, we develop a novel model, employing context-dependent word-level attention for more accurate statement representations and question-guided sentence-level attention for better context modeling. We also generate unique IQA datasets to test our model, which will be made publicly available. Employing these attention mechanisms, our model accurately understands when it can output an answer or when it requires generating a supplementary question for additional input, depending on different contexts. When available, the user's feedback is encoded and directly applied to update sentence-level attention to infer an answer. Extensive experiments on QA and IQA datasets quantitatively demonstrate the effectiveness of our model, with significant improvement over state-of-the-art conventional QA models.
KEYWORDS
Question Answering; Interactive Question Answering; Attention; Recurrent Neural Network
1 INTRODUCTION
With the availability of large-scale QA datasets, high-capacity machine learning/data mining models, and powerful computational devices, research on QA has become active and fruitful. Commercial QA products such as Google Assistant, Apple Siri, Amazon Alexa, Facebook M, Microsoft Cortana, Xiaobing in Chinese, Rinna in Japanese, and MedWhat have been released in the past several years. The ultimate goal of QA research is to build intelligent systems capable of naturally communicating with humans, which
∗ Most of this work was done when the first author was an intern at NEC Labs America.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
KDD'17, August 13–17, 2017, Halifax, NS, Canada.
© 2017 ACM. 978-1-4503-4887-4/17/08. . . $15.00
DOI: http://dx.doi.org/10.1145/3097983.3098115
poses a major challenge for natural language processing and machine learning. Inspired by the recent success of sequence-to-sequence models with an encoder-decoder framework [5, 21], researchers have attempted to apply variants of such models with explicit memory and attention to QA tasks, aiming to move a step further from machine learning to machine reasoning [12, 17, 26]. All these models similarly employ encoders to map statements and questions to fixed-length feature vectors, and a decoder to generate outputs. Empowered by the adoption of memory and attention, they have achieved remarkable success on several challenging public datasets, including the recently acclaimed Facebook bAbI dataset [24].
However, previous models suffer from the following important limitations [12, 17, 25, 26]. First, they fail to model the context-dependent meaning of words. Different words may have different meanings in different contexts, which increases the difficulty of extracting the essential semantic logic flow of each sentence in different paragraphs. Second, many existing models only work in ideal QA settings and fail to address the uncertain situations under which models require additional user input to gather complete information to answer a given question. As shown in Table 1, the example on the top is an ideal QA problem. We can clearly understand what the question is and then locate the relevant input sentences to generate the answer. But it is hard to answer the question in the bottom example, because two types of bedrooms are mentioned in the input sentences (i.e., the story) and we do not know which bedroom the user refers to. These scenarios with incomplete information naturally appear in human conversations, and thus, effectively handling them is a key capability of intelligent QA models.
To address the challenges presented above, we propose a Context-aware Attention Network (CAN) to learn fine-grained representations for input sentences, and develop a mechanism to interact with the user to comprehensively understand a given question. Specifically, we employ two levels of attention, applied at the word level and the sentence level, to compute representations of all input sentences. The context information extracted from an input story is allowed to influence the attention over each word, and governs how each word's semantic meaning contributes to a sentence representation. In addition, an interactive mechanism is created to generate a supplementary question for the user when the model determines that it does not have enough information to answer a given question. The user's feedback on the supplementary question is then encoded and exploited to attend over all input sentences to infer an answer. Our proposed model CAN can be viewed as an encoder-decoder approach augmented with two-level attention and an interactive mechanism, rendering our model self-adaptive, as illustrated in Figure 1.
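To make the two-level attention concrete, the following is a minimal NumPy sketch of the general pattern: context-dependent attention over words yields each sentence representation, and question-guided attention over sentences yields a context representation. The dot-product scoring, variable names, and dimensions here are illustrative assumptions; the paper's exact parameterized scoring and gating functions are defined in later sections.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def word_level_attention(word_embs, context_vec):
    # word_embs: (num_words, d); context_vec: (d,)
    # Context-dependent scores decide how much each word
    # contributes to the sentence representation.
    alphas = softmax(word_embs @ context_vec)
    return alphas @ word_embs            # sentence representation, shape (d,)

def sentence_level_attention(sent_reps, question_vec):
    # sent_reps: (num_sents, d); question_vec: (d,)
    # Question-guided scores weight each sentence's contribution
    # to the overall context representation.
    betas = softmax(sent_reps @ question_vec)
    return betas @ sent_reps             # context representation, shape (d,)

rng = np.random.default_rng(0)
d = 8
story = [rng.normal(size=(5, d)) for _ in range(3)]  # 3 sentences, 5 words each
context = rng.normal(size=d)                         # story-level context vector
question = rng.normal(size=d)                        # encoded question

sent_reps = np.stack([word_level_attention(s, context) for s in story])
context_rep = sentence_level_attention(sent_reps, question)
print(context_rep.shape)  # (8,)
```

In the full model, updating the sentence-level weights with the encoded user feedback (rather than the question alone) is what lets the interactive mechanism refine its answer after a supplementary question.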