Querying Arbitrary Facts: Inferring Truth from Incomplete Databases


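The paper "Philosophers are Mortal- Inferring the Truth of Unseen Facts(CONLL13).pdf" addresses common-sense reasoning over large databases of facts. As such a database broadens its coverage it stays accurate but becomes increasingly incomplete. The authors propose a system that can be queried for whether the database contains an arbitrary fact, instead of continually extending a database of known facts as conventional open-domain information extraction does. The approach "smooths" the existing data into a database in which any possible fact has membership with some confidence. The keyword is "common-sense reasoning": the system targets facts that are not explicitly recorded but can still be inferred from common sense. The authors, Gabor Angeli and Christopher D. Manning of Stanford University, evaluate the system on predicting held-out facts, where it reaches 74.2% accuracy and outperforms multiple baselines. They also evaluate it as a common-sense filter for the ReVerb Open IE system and as an answer validation method in a question answering task. ReVerb is an open information extraction tool, and the question answering task requires confirming whether a proposed answer is correct or plausible.

1. The introduction notes that databases of facts, such as Freebase or Open IE extractions, are common across many applications. These databases are accurate, but the more they cover, the more incomplete they become. The proposed method addresses this by inferring the facts missing from the database.

2. The core of the system is a database that covers every possible fact, each with an associated confidence. This lets the system make a probabilistic judgment about a fact it has never seen.

3. In the experiments, the system predicts held-out facts with 74.2% accuracy, which indicates that it can handle knowledge that does not appear in the database.

4. As a common-sense filter for ReVerb, the system is used to remove Open IE extractions that are implausible.

5. In the question answering task, the system is used to validate candidate answers by checking whether they are plausible, which matters for automatic question answering systems that are expected to return correct and sensible answers.

Overall, the paper proposes a common-sense reasoning approach to the incompleteness of large fact databases and evaluates it in two downstream applications.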

Philosophers are Mortal: Inferring the Truth of Unseen Facts

Gabor Angeli

Stanford University

Stanford, CA 94305

angeli@stanford.edu

Christopher D. Manning

Stanford University

Stanford, CA 94305

manning@stanford.edu

Abstract

Large databases of facts are prevalent in many applications. Such databases are accurate, but as they broaden their scope they become increasingly incomplete. In contrast to extending such a database, we present a system to query whether it contains an arbitrary fact. This work can be thought of as re-casting open domain information extraction: rather than growing a database of known facts, we smooth this data into a database in which any possible fact has membership with some confidence. We evaluate our system predicting held out facts, achieving 74.2% accuracy and outperforming multiple baselines. We also evaluate the system as a common-sense filter for the ReVerb Open IE system, and as a method for answer validation in a Question Answering task.

1 Introduction

Databases of facts, such as Freebase (Bollacker et al., 2008) or Open Information Extraction (Open IE) extractions, are useful for a range of NLP applications from semantic parsing to information extraction. However, as the domain of a database grows, it becomes increasingly impractical to collect completely, and increasingly unlikely that all the elements intended for the database are explicitly mentioned in the source corpus. In particular, common-sense facts are rarely explicitly mentioned, despite their abundance. It would be useful to infer the truth of such unseen facts rather than assuming them to be implicitly false.

A growing body of work has focused on automatically extending large databases with a finite set of additional facts. In contrast, we propose a system to generate the (possibly infinite) completion of such a database, with a degree of confidence for each unseen fact. This task can be cast as querying whether an arbitrary element is a member of the database, with an informative degree of confidence. Since often the facts in these databases are devoid of context, we refine our notion of truth to reflect whether we would assume a fact to be true without evidence to the contrary. In this vein, we can further refine our task as determining whether an arbitrary fact is plausible: true in the absence of contradictory evidence.

In addition to general applications of such large databases, our approach can further be integrated into systems which can make use of probabilistic membership. For example, certain machine translation errors could be fixed by determining that the target translation expresses an implausible fact. Similarly, the system can be used as a soft feature for semantic compatibility in coreference; e.g., the types of phenomena expressed in Hobbs’ selectional constraints (Hobbs, 1978). Lastly, it is useful as a common-sense filter; we evaluate the system in this role by filtering implausible facts from Open IE extractions, and filtering incorrect responses for a question answering system.

Our approach generalizes word similarity metrics to a notion of fact similarity, and judges the membership of an unseen fact based on the aggregate similarity between it and existing members of the database. For instance, if we have not seen the fact that philosophers are mortal but we know that Greeks are mortal, and that philosophers and Greeks are similar, we would like to infer that the fact is nonetheless plausible.
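The idea of lifting word similarity to fact similarity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the triple representation of a fact, the hand-picked similarity values in WORD_SIM, the use of min to combine argument similarities, and the use of max as a stand-in for the aggregate over database members.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fact is modelled as (argument 1, relation, argument 2). This triple
# layout is an assumption for the sketch.
Fact = Tuple[str, str, str]

# Hypothetical, hand-picked word similarities in [0, 1]. A real system would
# derive these from a lexical resource or from distributional statistics.
WORD_SIM: Dict[FrozenSet[str], float] = {
    frozenset({"philosophers", "Greeks"}): 0.8,
    frozenset({"philosophers", "rocks"}): 0.1,
    frozenset({"Greeks", "rocks"}): 0.1,
}


def word_similarity(a: str, b: str) -> float:
    """Symmetric word similarity; identical words score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset({a, b}), 0.0)


def fact_similarity(f: Fact, g: Fact) -> float:
    """Similarity of two facts: 0.0 if the relations differ, otherwise the
    weaker of the two argument similarities (an illustrative choice)."""
    if f[1] != g[1]:
        return 0.0
    return min(word_similarity(f[0], g[0]), word_similarity(f[2], g[2]))


def plausibility(query: Fact, database: List[Fact]) -> float:
    """Confidence that `query` belongs to the database, taken here as the
    similarity to its closest known fact (max as a simple aggregate)."""
    return max((fact_similarity(query, known) for known in database), default=0.0)


DATABASE: List[Fact] = [("Greeks", "are", "mortal")]

print(plausibility(("philosophers", "are", "mortal"), DATABASE))
print(plausibility(("rocks", "are", "mortal"), DATABASE))
```

With these toy values the unseen fact about philosophers receives a higher confidence than the one about rocks, because philosophers are closer to Greeks, the only subject the database knows to be mortal.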

We implement our approach on both a large open-domain database of facts extracted from the Open IE system ReVerb (Fader et al., 2011), and ConceptNet (Liu and Singh, 2004), a hand curated database of common sense facts.

Footnote: "philosophers are mortal" is an unseen fact in http://openie.cs.washington.edu.

