Encoding World Knowledge in the Evaluation of Local Coherence
Muyu Zhang¹∗, Vanessa Wei Feng², Bing Qin¹, Graeme Hirst², Ting Liu¹ and Jingwen Huang¹
¹Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin, China
²Department of Computer Science, University of Toronto, Toronto, ON, Canada
{myzhang,qinb,tliu,jwhuang}@ir.hit.edu.cn
{weifeng,gh}@cs.toronto.edu
Abstract
Previous work on text coherence was primarily based on matching multiple mentions of the same entity in different parts of the text; therefore, it misses the contribution from semantically related but not necessarily coreferential entities (e.g., Gates and Microsoft). In this paper, we capture such semantic relatedness by leveraging world knowledge (e.g., Gates is the person who created Microsoft), and evaluate it within two existing frameworks. First, in the unsupervised framework, we introduce semantic relatedness as an enrichment to the original graph-based model of Guinaudeau and Strube (2013). In addition, we incorporate semantic relatedness as additional features into the popular entity-based model of Barzilay and Lapata (2008). Across both frameworks, our enriched model with semantic relatedness outperforms the original methods, especially on short documents.
1 Introduction
In a well-written document, sentences are organized and presented in a logical and coherent form, which makes the text fluent and easily understood. Therefore, coherence is a fundamental aspect of high text quality, and the evaluation of coherence is a crucial component of many NLP applications, such as essay scoring (Miltsakaki and Kukich, 2004), story generation (McIntyre and Lapata, 2010), and document summarization (Barzilay et al., 2002).
∗This work was partly done while the first author was visiting the University of Toronto.
A particularly popular model for evaluating text coherence is the entity-based local coherence model of Barzilay and Lapata (2008) (B&L), which extracts mentions of entities in adjacent sentences, and captures local coherence in terms of the transitions in the grammatical role of each mention. Following this direction, a number of extensions have been proposed (Elsner and Charniak, 2008; Elsner and Charniak, 2011; Lin et al., 2011; Feng et al., 2014), the majority of which focus on enriching the original entity features. An exception is the unsupervised model of Guinaudeau and Strube (2013) (G&S), which converts the document into a graph of sentences, and evaluates the text coherence by computing the average out-degree over the entire graph.
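To make the graph-based intuition concrete, the following Python sketch scores a document by the average out-degree of a sentence graph in which a forward edge connects two sentences that mention a common entity. It is a simplified illustration under our own assumptions (each sentence represented as a set of entity strings, unweighted edges), not the full projection-based model of G&S.

    def coherence_score(sentences):
        """sentences: list of sets of entity strings, in document order."""
        n = len(sentences)
        if n < 2:
            return 0.0
        out_degree = [0] * n
        for i in range(n):
            for j in range(i + 1, n):
                # A forward edge exists when two sentences mention a common entity.
                if sentences[i] & sentences[j]:
                    out_degree[i] += 1
        # Coherence is the average out-degree over all sentences.
        return sum(out_degree) / n

    # Example: three sentences, each represented by the entities it mentions.
    print(coherence_score([{"gates", "microsoft"}, {"gates"}, {"microsoft"}]))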
However, despite the apparent success of these methods, they rely merely on matching mentions of the same entity, but neglect the contribution from semantically related but not necessarily coreferential entities. For example, the text in Figure 1a¹ has no common entity in s₂ and s₃. However, the transition between them is perfectly coherent, because there exists close semantic relatedness between two distinct entities, Gates in s₂ and Microsoft in s₃, which can be captured by the world knowledge that Gates is the person who created Microsoft (represented by Gates-create-Microsoft). In fact, the absence of common entities between adjacent sentences is quite prevalent. Analyzing the CoNLL 2012 dataset (Pradhan et al., 2012), we found that 42.34% of the time, adjacent sentences do not share common entities. As a result, methods which rely on strict entity matching would fail in such cases.
¹Based on a news item: http://www.cnbc.com/id/101576926
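The following sketch illustrates, again under our own simplifying assumptions, how such world knowledge could bridge this gap: a small store of relation triples (here a single hypothetical Gates-create-Microsoft entry) lets two sentences be connected even when they share no entity. It is meant only to clarify the idea; the models we actually use are the enriched graph-based and entity-based frameworks described later in the paper.

    # Hypothetical store of world-knowledge relations as (arg1, relation, arg2) triples.
    TRIPLES = {("gates", "create", "microsoft")}

    def related(e1, e2, triples=TRIPLES):
        """True if the two entities are linked by any world-knowledge relation."""
        return any({e1, e2} == {a1, a2} for a1, _, a2 in triples)

    def sentences_connected(ents_i, ents_j):
        """Connect two sentences if they share an entity or contain related entities."""
        if ents_i & ents_j:
            return True
        return any(related(e1, e2) for e1 in ents_i for e2 in ents_j)

    # s2 mentions Gates and s3 mentions Microsoft: no shared entity, but the
    # Gates-create-Microsoft triple still links the two sentences.
    print(sentences_connected({"gates"}, {"microsoft"}))  # True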