Reinforcing Coherence in Dialogue Generation: Improving the Generation Quality of Seq2Seq Models
Updated on 2024-08-26
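This paper examines how to improve the coherence of sequence-to-sequence (Seq2Seq) models in dialogue generation. Conventional Seq2Seq methods perform well in single-turn dialogue generation, but they tend to produce generic responses that lack specific meaning. The authors attribute this to the training objective: it is equivalent to optimizing the Kullback-Leibler (KL) divergence, which does not penalize a response whose generated probability is high while its true probability is low.
The true probability is unknown, so the model cannot detect such cases directly. To work around this, the authors build on the observation that the coherence (i.e., similarity) between a post and its response is consistent with human evaluation, and hypothesize that the true probability of a response is proportional to its coherence. Coherence scores then serve as the reward function in a reinforcement learning framework, which pushes the model toward responses with high true probability rather than merely high generated probability.
Three coherence models are proposed. The first is an unlearned similarity function that directly measures the similarity between post and response. The second is a pretrained semantic matching function that uses a previously trained text matching model to score how well a response fits its post. The third is an end-to-end dual learning architecture that jointly learns the response generation model and the coherence model.
Experiments on a Chinese Weibo dataset and an English Subtitle dataset show that the proposed models produce more specific and meaningful responses and outperform conventional Seq2Seq models under both metric-based and human evaluation. The work offers one way to improve the coherence and specificity of dialogue generation models.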
Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation
Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu and Xueqi Cheng
University of Chinese Academy of Sciences, Beijing, China
CAS Key Lab of Network Data Science and Technology,
Institute of Computing Technology, Chinese Academy of Sciences
zhanghainan@software.ict.ac.cn, {lanyanyan, guojiafeng, junxu, cxq}@ict.ac.cn
Abstract
The sequence-to-sequence (Seq2Seq) approach has gained great attention in the field of single-turn dialogue generation. However, one serious problem is that most existing Seq2Seq-based models tend to generate common responses lacking specific meanings. Our analysis shows that the underlying reason is that Seq2Seq is equivalent to optimizing the Kullback–Leibler (KL) divergence, and thus does not penalize the case in which the generated probability is high while the true probability is low. However, the true probability is unknown, which poses challenges for tackling this problem. Inspired by the fact that the coherence (i.e., similarity) between post and response is consistent with human evaluation, we hypothesize that the true probability of a response is proportional to its degree of coherence. The coherence scores are then used as the reward function in a reinforcement learning framework to penalize the case in which the generated probability is high while the true probability is low. Three different types of coherence models are proposed in this paper: an unlearned similarity function, a pretrained semantic matching function, and an end-to-end dual learning architecture. Experimental results on both the Chinese Weibo dataset and the English Subtitle dataset show that the proposed models produce more specific and meaningful responses, yielding better performance than Seq2Seq models in terms of both metric-based and human evaluations.
1 Introduction
This paper focuses on the problem of single-turn dialogue generation, in which a system is expected to automatically generate an appropriate response for a given post. Following the conventional data-driven generation framework of statistical machine translation, most existing neural conversation models are based on a Seq2Seq architecture [Sutskever et al., 2014]. In these models, a recurrent neural network (RNN) encoder first encodes the input post into a vector, and another RNN decoder then generates the response. To learn the model parameters, maximum likelihood estimation is applied to training data consisting of many post-response pairs. The underlying assumption is that, with proper parameters, the generated probability will approximate the true probability.
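The maximum likelihood objective described above can be written out as follows. This is the standard Seq2Seq formulation rather than notation taken from the paper: the symbols $X$ (post), $Y = (y_1, \dots, y_T)$ (response), $\theta$ (model parameters) and $\mathcal{D}$ (set of post-response pairs) are assumptions introduced here for illustration.

```latex
\hat{\theta}
  = \arg\max_{\theta} \sum_{(X, Y) \in \mathcal{D}} \log P_{\theta}(Y \mid X),
\qquad
P_{\theta}(Y \mid X)
  = \prod_{t=1}^{T} P_{\theta}(y_t \mid y_{<t}, X)
```

Here $P_{\theta}(Y \mid X)$ plays the role of the generated probability, and the training pairs in $\mathcal{D}$ are treated as samples from the unknown true probability.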
Though Seq2Seq can generate fluent responses, one serious problem is that the generated responses are usually common ones, such as 'I do not know', 'What does this mean?' and 'Haha' [Li et al., 2016a; Mou et al., 2017]. Clearly, such responses lack the specific meaning needed to widen and deepen the dialogue, which harms the user experience. Our analysis shows that the main reason is that the objective of Seq2Seq is equivalent to minimizing the KL divergence between the generated probability and the true probability. However, KL divergence is not symmetric, so it does not penalize the case in which the generated probability is high while the true probability is low, which is exactly the case of common responses.
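The asymmetry argument can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the two distributions over three candidate responses are made-up numbers, and the direction of the divergence that maximum likelihood minimizes is taken to be KL(true || generated), which is the standard reading of the equivalence. It compares how much the generic response contributes to that divergence with how much it contributes to the reverse direction.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) = sum_y p(y) * log(p(y) / q(y)); terms with p(y) == 0 add 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy distributions over three candidate responses to one post:
# index 0 is a generic reply ("I do not know"), indices 1 and 2 are specific.
p_true = [0.02, 0.49, 0.49]     # assumed true distribution: generic reply is rare
q_generic = [0.60, 0.20, 0.20]  # assumed model: most mass on the generic reply

# Per-response contributions to KL(true || generated), the direction that
# maximum likelihood training minimizes.
forward_terms = [pi * math.log(pi / qi) for pi, qi in zip(p_true, q_generic)]

# Per-response contributions to the reverse direction, KL(generated || true).
reverse_terms = [qi * math.log(qi / pi) for pi, qi in zip(p_true, q_generic)]

forward_kl = kl_divergence(p_true, q_generic)
reverse_kl = kl_divergence(q_generic, p_true)

print("forward contribution of generic reply:", round(forward_terms[0], 3))
print("reverse contribution of generic reply:", round(reverse_terms[0], 3))
print("KL(true || generated):", round(forward_kl, 3))
print("KL(generated || true):", round(reverse_kl, 3))
```

In this toy setting the generic reply adds nothing positive to the forward divergence because its term is weighted by the small true probability, while the same reply dominates the reverse divergence. That gap is what the coherence reward is meant to compensate for.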
In this paper, we propose to use the coherence (i.e., similarity) between the generated response and the original post as an estimate of the true probability, inspired by the fact that the similarity measure between post and response embeddings is consistent with human evaluation. Specifically, three kinds of coherence models are adopted in this paper. Firstly, an unlearned similarity function, such as cosine similarity, can be used directly as the coherence model. Secondly, existing semantic text matching models are good candidates for measuring the coherence between a post and its corresponding response. In this paper, we use two pretrained matching functions, i.e., the GRU bilinear model [Socher et al., 2013] and MatchPyramid [Pang et al., 2016], which are representative of two different kinds of deep matching models: representation-focused methods and interaction-focused methods. Thirdly, an end-to-end dual learning architecture similar to [Xia et al., 2016] can be adopted to jointly learn the parameters of the response generation model and the coherence model. The coherence model is then used as the reward function in a reinforcement learning framework for optimization, which guides the learning process to penalize the case in which the generated probability is high while the true probability is low.
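A minimal sketch of the first coherence model, cosine similarity used as a reward, is given below. Several pieces are assumptions made for illustration and are not taken from the paper: the tiny two-dimensional embedding table `EMB`, the use of mean word embeddings as sentence representations, the helper names, and the choice of a REINFORCE-style surrogate loss as the concrete reinforcement learning update.

```python
import math

# Hypothetical 2-d word embeddings; a real system would use trained vectors.
EMB = {
    "weather": [1.0, 0.0],
    "sunny": [0.8, 0.2],
    "rain": [0.9, 0.1],
    "not": [0.1, 0.9],
    "know": [0.0, 1.0],
}


def mean_embedding(tokens, table):
    """Average the vectors of the tokens found in the table (assumed encoder)."""
    vecs = [table[t] for t in tokens if t in table]
    dim = len(next(iter(table.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def coherence_reward(post, response, table):
    """Unlearned coherence score between a post and a candidate response."""
    return cosine(mean_embedding(post, table), mean_embedding(response, table))


def reinforce_loss(reward, token_log_probs, baseline=0.0):
    """REINFORCE surrogate: -(reward - baseline) * log-probability of the sample."""
    return -(reward - baseline) * sum(token_log_probs)


post = ["weather", "sunny"]
r_specific = coherence_reward(post, ["rain"], EMB)
r_generic = coherence_reward(post, ["not", "know"], EMB)
print("reward for specific reply:", round(r_specific, 3))
print("reward for generic reply:", round(r_generic, 3))
```

Minimizing `reinforce_loss` with respect to the generator's parameters raises the log-probability of sampled responses whose reward exceeds the baseline and lowers it otherwise. The pretrained matching functions and the dual learning architecture would replace `coherence_reward` with a learned scorer while leaving this update unchanged.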
We evaluate the proposed models on two public datasets, i.e., the Chinese Weibo dataset and the English Subtitle dataset. Experimental results show that our models significantly outper-
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)