Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation
Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu and Xueqi Cheng
University of Chinese Academy of Sciences, Beijing, China
CAS Key Lab of Network Data Science and Technology,
Institute of Computing Technology, Chinese Academy of Sciences
zhanghainan@software.ict.ac.cn, {lanyanyan, guojiafeng, junxu, cxq}@ict.ac.cn
Abstract
The sequence to sequence (Seq2Seq) approach has
gained great attention in the field of single-turn
dialogue generation. However, one serious problem
is that most existing Seq2Seq-based models
tend to generate common responses that lack specific
meaning. Our analysis shows that the underlying
reason is that Seq2Seq is equivalent to optimizing
the Kullback–Leibler (KL) divergence, and thus does
not penalize the case in which the generated probability
is high while the true probability is low. However,
the true probability is unknown, which poses
challenges for tackling this problem. Inspired by
the fact that the coherence (i.e. similarity) between
post and response is consistent with human eval-
uation, we hypothesize that the true probability of
a response is proportional to the coherence degree.
The coherence scores are then used as the reward
function in a reinforcement learning framework to
penalize the case in which the generated probability is
high while the true probability is low. Three dif-
ferent types of coherence models, including an un-
learned similarity function, a pretrained semantic
matching function, and an end-to-end dual learn-
ing architecture, are proposed in this paper. Ex-
perimental results on both Chinese Weibo dataset
and English Subtitle dataset show that the pro-
posed models produce more specific and meaning-
ful responses, yielding better performance than
Seq2Seq models in terms of both metric-based and
human evaluations.
1 Introduction
This paper focuses on the problem of single-turn dialogue
generation, which is expected to automatically generate an
appropriate response for a given post. Following the
conventional data-driven generation framework of statistical
machine translation, most existing neural conversation models
are based on a Seq2Seq architecture [Sutskever et al., 2014].
In these models, a recurrent neural network (RNN) encoder
first encodes the input post into a vector, and an RNN decoder
then generates the response. To learn the model parameters,
maximum likelihood estimation (MLE) is applied to the training data,
which consists of many post-response pairs. The intrinsic philosophy
is that, with proper parameters, the generated probability serves
as an estimate of the true probability.
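For concreteness, this objective can be sketched as follows (the notation below is ours, chosen for illustration, and is not taken from the paper): given a set of post-response pairs $\mathcal{D}$, MLE fits the generated probability $Q_\theta$ by solving
\[
\max_\theta \sum_{(x,y)\in\mathcal{D}} \log Q_\theta(y \mid x)
= \max_\theta \sum_{(x,y)\in\mathcal{D}} \sum_{t=1}^{|y|} \log Q_\theta(y_t \mid y_{<t}, x),
\]
where $x$ is the post, $y$ the response, and $y_{<t}$ the previously generated tokens.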
Though Seq2Seq has the ability to generate fluent re-
sponses, one serious problem is that the generated responses
are usually common, such as ‘I do not know’, ‘What does
this mean?’ and ‘Haha’ [Li et al., 2016a; Mou et al., 2017].
Clearly, these kinds of responses lack the specific meaning
needed to widen and deepen the dialogue, which harms the
users’ experience. Our analysis shows that the main reason is
that the objective of Seq2Seq is equivalent to minimizing the
KL divergence between the generated probability and the true
probability. However, the KL divergence is not symmetric, and
thus it does not penalize the case in which the generated
probability is high while the true probability is low,
which is exactly the case of common responses.
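To spell out this step (a standard argument, using the notation introduced above): with $P$ the true probability and $Q_\theta$ the generated probability, MLE is equivalent, up to the entropy of $P$ (a constant in $\theta$), to minimizing
\[
\mathrm{KL}(P \,\|\, Q_\theta) = \sum_{y} P(y \mid x) \log \frac{P(y \mid x)}{Q_\theta(y \mid x)}.
\]
Every term is weighted by $P(y \mid x)$, so a common response $y$ with low true probability contributes almost nothing to this objective even when $Q_\theta(y \mid x)$ is large; only the reverse direction $\mathrm{KL}(Q_\theta \,\|\, P)$ would penalize such misplaced probability mass.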
In this paper, we propose to utilize the coherence (i.e.
similarity) between the generated response and the original
post as an estimate of the true probability, inspired by the
fact that the similarity between post and response embeddings
is consistent with human evaluation. Specifically, three kinds
of coherence models are
adopted in this paper. Firstly, an unlearned similarity func-
tion, such as cosine similarity, can be directly used as the co-
herence model. Secondly, previous semantic text matching
models can be regarded as good candidates for measuring
the coherence between a post and its corresponding response.
In this paper, we use two pretrained matching functions,
i.e., the GRU bilinear model [Socher et al., 2013] and
MatchPyramid [Pang et al., 2016], which are representatives
of two different kinds of deep matching models, i.e.,
representation-focused and interaction-focused methods.
Thirdly, an end-to-end dual learning architecture similar to
[Xia et al., 2016] can be adopted to jointly learn the parameters
of the response generation model and the coherence model. After
that, the coherence model is used as the reward function in
a reinforcement learning framework for optimization, which
guides the learning process to penalize the case in which the
generated probability is high while the true probability is low.
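As an illustration of the simplest variant (the unlearned cosine-similarity coherence model used as a REINFORCE-style reward), the following is a minimal sketch; the embedding averaging, the function names, and the use of a sampled response with a baseline are our own assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def sentence_embedding(tokens, word_vectors):
    """Average word vectors to obtain a sentence embedding (illustrative choice)."""
    vecs = [word_vectors[w] for w in tokens if w in word_vectors]
    if not vecs:  # no known words: return a zero vector of the right size
        return np.zeros(next(iter(word_vectors.values())).shape)
    return np.mean(vecs, axis=0)

def coherence_reward(post_tokens, response_tokens, word_vectors):
    """Unlearned coherence model: cosine similarity between post and response embeddings."""
    p = sentence_embedding(post_tokens, word_vectors)
    r = sentence_embedding(response_tokens, word_vectors)
    denom = np.linalg.norm(p) * np.linalg.norm(r)
    return float(np.dot(p, r) / denom) if denom > 0 else 0.0

def reinforce_loss(log_probs, reward, baseline=0.0):
    """REINFORCE-style objective: scale the sampled response's total log-likelihood
    by (reward - baseline); minimizing this pushes probability mass toward
    responses that are coherent with the post."""
    return -(reward - baseline) * sum(log_probs)
```

In training, a response would be sampled from the Seq2Seq decoder, its per-token log-probabilities collected as `log_probs`, and the gradient of this loss backpropagated through the decoder; a pretrained matching model (GRU bilinear or MatchPyramid) or the dual-learning coherence model would simply take the place of `coherence_reward`.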
We evaluate the proposed models on two public datasets,
i.e. the Chinese Weibo and the English Subtitle dataset. Ex-
perimental results show that our models significantly outper-