Knowledge-Graph-Driven Multi-Sentence Text Generation: The Graph Transformer Approach


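Generating text that expresses complex ideas from a knowledge graph is a key task, particularly when the information spans multiple sentences. Manually creating a document structure (a document plan) is costly and inefficient, and cannot meet this need. The paper "Text Generation from Knowledge Graphs with Graph Transformers" explores multi-sentence text generation from knowledge graphs. Such graphs are a ubiquitous form of information representation in computing, but their non-hierarchical structure, collapsing of long-distance dependencies, and structural variety pose challenges for existing text generation techniques.

To address this, the authors propose a novel graph Transformer encoder. The encoder is designed to exploit the relational structure of a knowledge graph without linearizing it (converting the graph into a one-dimensional sequence) or imposing a hierarchy on it. The aim is to free generation from its dependence on conventional text structure, so that the model can better represent complex connections between pieces of knowledge and produce more coherent, more informative multi-sentence text. The key innovation is a method for modeling the relationships between the nodes and edges of a knowledge graph. These relationships may include associations between entities, causal chains of events, or interactions between attributes. The graph Transformer encoder may use an attention mechanism to focus on key nodes and paths, so that the generated text is fluent and accurate.

The paper may also cover training strategies and evaluation metrics, for example unsupervised or reinforcement learning to optimize the model, and automatic metrics such as BLEU and ROUGE to measure the similarity between generated text and human-written reference text. The authors may further discuss how to handle noise and uncertainty in the knowledge graph, and how to balance generation quality against efficiency in practical applications. Overall, the paper offers a new way to build adaptive, flexible, and effective text generation systems that turn information extracted into a knowledge graph into high-quality multi-sentence text, with potential impact on news summarization, story generation, and question answering.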
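The summary mentions automatic metrics such as BLEU that compare generated text with a human-written reference. As a rough illustration of the core idea behind such metrics, the sketch below computes clipped n-gram precision in plain Python. It is not the full BLEU definition (there is no brevity penalty and no averaging across n-gram orders), and the function names and example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it appears
    in the reference (the "clipping" used by BLEU-style metrics).
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


# Hypothetical reference and system output, whitespace-tokenized.
reference = "we present a crf model for event detection".split()
candidate = "we present a model for event detection".split()

unigram_p = clipped_precision(candidate, reference, 1)
bigram_p = clipped_precision(candidate, reference, 2)
print(unigram_p, bigram_p)
```

Here every candidate word appears in the reference, so the unigram precision is perfect, while the bigram precision drops because dropping "crf" creates the unseen bigram "a model".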

Text Generation from Knowledge Graphs with Graph Transformers

Rik Koncel-Kedziorski (1), Dhanush Bekal (1), Yi Luan (1), Mirella Lapata (2), and Hannaneh Hajishirzi (1,3)

(1) University of Washington
{kedzior,dhanush,luanyi,hannaneh}@uw.edu

(2) University of Edinburgh
mlap@inf.ed.ac.uk

(3) Allen Institute for Artificial Intelligence

Abstract

Generating texts which express complex ideas spanning multiple sentences requires a structured representation of their content (document plan), but these representations are prohibitively expensive to manually produce. In this work, we address the problem of generating coherent multi-sentence texts from the output of an information extraction system, and in particular a knowledge graph. Graphical knowledge representations are ubiquitous in computing, but pose a significant challenge for text generation techniques due to their non-hierarchical nature, collapsing of long-distance dependencies, and structural variety. We introduce a novel graph transforming encoder which can leverage the relational structure of such knowledge graphs without imposing linearization or hierarchical constraints. Incorporated into an encoder-decoder setup, we provide an end-to-end trainable system for graph-to-text generation that we apply to the domain of scientific text. Automatic and human evaluations show that our technique produces more informative texts which exhibit better document structure than competitive encoder-decoder methods.
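To make the idea of encoding a graph without linearizing it more concrete, here is a minimal sketch of one attention step in which each node attends only to itself and its graph neighbors. This is an illustrative simplification and not the authors' encoder: it has no learned projections, no multiple heads, and no feed-forward layers, and the names `neighbor_attention`, `features`, and `neighbors` as well as the toy feature values are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def neighbor_attention(features, neighbors):
    """One round of attention in which each node sees only its neighborhood.

    features:  dict node -> list of floats (all the same length)
    neighbors: dict node -> list of adjacent nodes (the node itself is added)
    Returns a dict node -> updated feature vector.
    """
    updated = {}
    for node, vec in features.items():
        # The attention context is the node plus its graph neighbors,
        # not every position of a flattened sequence.
        context = [node] + [m for m in neighbors.get(node, []) if m != node]
        scale = math.sqrt(len(vec))
        weights = softmax([dot(vec, features[m]) / scale for m in context])
        updated[node] = [
            sum(w * features[m][i] for w, m in zip(weights, context))
            for i in range(len(vec))
        ]
    return updated


# Toy graph: three nodes in a chain a - b - c, with made-up 2-d features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
updated = neighbor_attention(features, neighbors)
print(updated["a"])
```

Because node "a" is not adjacent to "c", its updated vector is a weighted mix of only the "a" and "b" features. Stacking several such steps would let information travel along longer paths in the graph.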

1 Introduction

Increases in computing power and model capacity have made it possible to generate mostly-grammatical sentence-length strings of natural language text. However, generating several sentences related to a topic and which display overall coherence and discourse-relatedness is an open challenge. The difficulties are compounded in domains of interest such as scientific writing. Here the variety of possible topics is great (e.g. topics as diverse as driving, writing poetry, and picking stocks are all referenced in one subfield of one scientific discipline). Additionally, there are strong constraints on document structure, as scientific communication requires carefully ordered explanations of processes and phenomena.

Data and code available at https://github.com/rikdz/GraphWriter

[Figure 1 panels. Title: Event Detection with Conditional Random Fields. Abstract: "We present a CRF Model for Event Detection. We evaluate this model on SemEval 2010 Task 11. Our Model outperforms HMM models by 15% on this data." Graph: nodes CRF Model, Event Detection, SemEval 2011 Task 11, HMM Models; edge labels used-for, evaluate-for, comparison.]

Figure 1: A scientific text showing the annotations of an information extraction system and the corresponding graphical representation. Coreference annotations shown in color. Our model learns to generate texts from automatically extracted knowledge using a graph encoder decoder setup.
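The graph panel of Figure 1 can be written down as a small set of (head, relation, tail) triples. The sketch below is one possible encoding in plain Python. The node and relation labels are copied from the figure, but the edge directions and the exact endpoints of each edge are assumptions, since they cannot be recovered from the extracted text, and the helper `build_adjacency` is hypothetical.

```python
from collections import defaultdict

# (head, relation, tail) triples based on the labels in Figure 1.
# Edge directions and endpoints are assumptions made for illustration.
triples = [
    ("CRF Model", "used-for", "Event Detection"),
    ("SemEval 2011 Task 11", "evaluate-for", "CRF Model"),
    ("SemEval 2011 Task 11", "evaluate-for", "HMM Models"),
    ("CRF Model", "comparison", "HMM Models"),
]


def build_adjacency(triples):
    """Index the triples by head and by tail.

    Returns two dicts: outgoing[head] -> [(relation, tail), ...] and
    incoming[tail] -> [(relation, head), ...], so the graph can be
    walked in either direction.
    """
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for head, relation, tail in triples:
        outgoing[head].append((relation, tail))
        incoming[tail].append((relation, head))
    return outgoing, incoming


outgoing, incoming = build_adjacency(triples)

# All distinct entities mentioned in the graph, in a stable order.
entities = sorted({e for head, _, tail in triples for e in (head, tail)})

print(entities)
print(outgoing["CRF Model"])
print(incoming["HMM Models"])
```

Note that an entity such as "CRF Model" takes part in several relations at once, which is the kind of non-hierarchical structure that a flat sequence or a tree does not capture directly.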

Many researchers have sought to address these issues by working with structured inputs. Data-to-text generation models (Konstas and Lapata, 2013; Lebret et al., 2016; Wiseman et al., 2017; Puduppully et al., 2019) condition text generation on table-structured inputs. Tabular input representations provide more guidance for producing longer texts, but are only available for limited domains as they are assembled at great expense by manual annotation processes.

The current work explores the possibility of using information extraction (IE) systems to automatically provide context for generating longer texts (Figure 1). Robust IE systems are available and have support over a large variety of textual domains, and often provide rich annotations of relationships that extend beyond the scope of

arXiv:1904.02342v2 [cs.CL] 18 May 2019
