Attention-Based Bidirectional Long Short-Term Memory Networks for
Relation Classification
Peng Zhou, Wei Shi, Jun Tian, Zhenyu Qi∗, Bingchen Li, Hongwei Hao, Bo Xu
Institute of Automation, Chinese Academy of Sciences
{zhoupeng2013, shiwei2013, tianjun2013, zhenyu.qi,
libingchen2013, hongwei.hao, xubo}@ia.ac.cn
∗ Corresponding author: zhenyu.qi@ia.ac.cn
Abstract
Relation classification is an important se-
mantic processing task in the field of nat-
ural language processing (NLP). State-of-
the-art systems still rely on lexical re-
sources such as WordNet or NLP systems
like dependency parsers and named entity
recognizers (NER) to get high-level fea-
tures. Another challenge is that important
information can appear at any position in
the sentence. To tackle these problems,
we propose Attention-Based Bidirectional
Long Short-Term Memory Networks (Att-
BLSTM) to capture the most important se-
mantic information in a sentence. The ex-
perimental results on the SemEval-2010
relation classification task show that our
method outperforms most of the existing
methods, using only word vectors.
1 Introduction
Relation classification is the task of finding seman-
tic relations between pairs of nominals, which is
useful for many NLP applications, such as infor-
mation extraction (Wu and Weld, 2010) and question
answering (Yao and Van Durme, 2014). For in-
stance, the following sentence contains an exam-
ple of the Entity-Destination relation between the
nominals Flowers and chapel.
⟨e1⟩ Flowers ⟨/e1⟩ are carried into the ⟨e2⟩ chapel ⟨/e2⟩.
⟨e1⟩, ⟨/e1⟩, ⟨e2⟩, ⟨/e2⟩ are four position indicators
which mark the start and end of the nominals
(Hendrickx et al., 2009).
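To make this preprocessing step concrete, the following minimal Python sketch shows how the four indicators can be spliced into a tokenized sentence. The helper name mark_entities, the plain-text markers "<e1>"/"</e1>"/"<e2>"/"</e2>", and the assumption that the first nominal precedes the second are illustrative choices, not details given in this section.

def mark_entities(tokens, e1_span, e2_span):
    """Insert position-indicator tokens around two nominal spans.

    e1_span and e2_span are (start, end) token indices, end exclusive,
    and e1 is assumed to appear before e2 in the sentence.
    """
    (s1, t1), (s2, t2) = e1_span, e2_span
    return (tokens[:s1] + ["<e1>"] + tokens[s1:t1] + ["</e1>"]
            + tokens[t1:s2] + ["<e2>"] + tokens[s2:t2] + ["</e2>"]
            + tokens[t2:])

# The example sentence above, already tokenized:
tokens = ["Flowers", "are", "carried", "into", "the", "chapel", "."]
print(mark_entities(tokens, e1_span=(0, 1), e2_span=(5, 6)))
# ['<e1>', 'Flowers', '</e1>', 'are', 'carried', 'into', 'the',
#  '<e2>', 'chapel', '</e2>', '.']

The indicators then appear as ordinary tokens of the input sequence, so the model can learn their role like that of any other word.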
Traditional relation classification methods that
employ handcrafted features from lexical resources
are usually based on pattern matching and have
achieved high performance (Bunescu and Mooney,
2005; Mintz et al., 2009; Rink and
Harabagiu, 2010). One downside of these methods
is that many traditional NLP systems are used to
extract high-level features such as part-of-speech
tags, shortest dependency paths, and named entities,
which increases computational cost and introduces
additional propagated errors. Another downside is
that manually designed features are time-consuming
to engineer and generalize poorly because of the low
coverage of different training datasets.
Recently, deep learning methods have provided an
effective way of reducing the number of handcrafted
features (Socher et al., 2012; Zeng et al., 2014).
However, these approaches still use lexical re-
sources such as WordNet (Miller, 1995) or NLP
systems like dependency parsers and NER to get
high-level features.
This paper proposes a novel neural network,
Att-BLSTM, for relation classification. Our model
combines a neural attention mechanism with
Bidirectional Long Short-Term Memory Networks
(BLSTM) to capture the most important semantic
information in a sentence. The model does not use
any features derived from lexical resources or NLP
systems.
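As a rough illustration of this architecture, here is a minimal sketch in PyTorch: token embeddings feed a bidirectional LSTM, the two directions are merged by element-wise summation, and a sentence vector is formed as an attention-weighted sum of the hidden states before classification. The class name AttBLSTM, all dimensions, and the initialization choices below are illustrative assumptions, not the paper's exact configuration; the full model, including regularization and the output layer, is specified in Section 3.

import torch
import torch.nn as nn

class AttBLSTM(nn.Module):
    # Minimal sketch: embedding -> BLSTM -> word-level attention -> classifier.
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.att_weight = nn.Parameter(torch.randn(hidden_dim))
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):                   # (batch, seq_len)
        x = self.embed(token_ids)                   # (batch, seq_len, embed_dim)
        out, _ = self.lstm(x)                       # (batch, seq_len, 2 * hidden_dim)
        hidden_dim = out.size(-1) // 2
        # Merge the forward and backward states by element-wise sum.
        h = out[..., :hidden_dim] + out[..., hidden_dim:]
        m = torch.tanh(h)                           # (batch, seq_len, hidden_dim)
        scores = m.matmul(self.att_weight)          # (batch, seq_len)
        alpha = torch.softmax(scores, dim=1)        # attention weights over words
        r = (h * alpha.unsqueeze(-1)).sum(dim=1)    # weighted sentence vector
        return self.fc(torch.tanh(r))               # class logits

In this sketch, a forward pass maps a batch of padded token-id sequences directly to relation scores; only word vectors are needed as input, with no externally derived features.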
The contribution of this paper is the use of a BLSTM
with an attention mechanism, which can automatically
focus on the words that have a decisive effect on
classification, to capture the most important semantic
information in a sentence without using extra
knowledge or NLP systems. We conduct experiments
on the SemEval-2010 Task 8 dataset and achieve an
F1-score of 84.0%, higher than most of the existing
methods in the literature.
The remainder of the paper is structured as fol-
lows. In Section 2, we review related work on
relation classification. Section 3 presents our Att-
BLSTM model in detail. In Section 4, we describe
details about the setup of experimental evaluation