Chinese Semantic Role Labeling with Bidirectional Recurrent Neural
Networks
Zhen Wang, Tingsong Jiang, Baobao Chang, Zhifang Sui
Key Laboratory of Computational Linguistics, Ministry of Education
School of Electronics Engineering and Computer Science, Peking University
Collaborative Innovation Center for Language Ability, Xuzhou 221009 China
wzpkuer@gmail.com, {tingsong, chbb, szf}@pku.edu.cn
Abstract
Traditional approaches to Chinese Semantic Role Labeling (SRL)
rely heavily on feature engineering. Even worse, these methods
can hardly model the long-range dependencies in a sentence.
In this paper, we introduce a bidirectional recurrent neural
network (RNN) with long short-term memory (LSTM) to capture
bidirectional and long-range dependencies in a sentence with
minimal feature engineering. Experimental results on the
Chinese Proposition Bank (CPB) show a significant improvement
over the state-of-the-art methods. Moreover, our model makes
it convenient to introduce heterogeneous resources, which
further improves our experimental performance.
1 Introduction
Semantic Role Labeling (SRL) is defined as the task of
recognizing the arguments of a given predicate and assigning
semantic role labels to them. Because of its ability to encode
semantic information, there has been increasing interest in
SRL for many languages (Gildea and Jurafsky, 2002; Sun and
Jurafsky, 2004). Figure 1 shows an example from the Chinese
Proposition Bank (CPB) (Xue and Palmer, 2003), which is a
Chinese corpus annotated with semantic role labels.
Figure 1: A sentence with semantic roles labeled from CPB.
Traditional approaches to Chinese SRL often extract a large
number of handcrafted features from the sentence, and even
from its parse tree, and feed these features to statistical
classifiers such as CRF, MaxEnt and SVM (Sun and Jurafsky,
2004; Xue, 2008; Ding and Chang, 2008; Ding and Chang, 2009;
Sun, 2010). However, these methods suffer from three major
problems. Firstly, their performance depends heavily on
feature engineering, which requires domain knowledge and
laborious feature extraction and selection. Secondly, although
sophisticated features are designed, the long-range
dependencies in a sentence can hardly be modeled. Thirdly, a
specific annotated dataset is often limited in its
scalability. Heterogeneous resources, which have very
different semantic role labels and annotation schemas but
related latent semantic meaning, can alleviate this problem;
however, traditional methods cannot easily relate distinct
annotation schemas or introduce heterogeneous resources.
To address these problems, we propose a bidirectional
recurrent neural network (RNN) with long short-term memory
(LSTM) for Chinese SRL. Our approach makes the following
contributions:
• We formulate Chinese SRL with a bidirectional LSTM RNN
model. With a bidirectional RNN, dependencies in a sentence
from both directions can be captured, and with the LSTM
architecture, long-range dependencies can be well modeled.
The test results on the benchmark dataset CPB show a
significant improvement over the state-of-the-art methods.
• Compared with previous work that relied on a huge number of
handcrafted features, our model achieves much better
performance with only minimal feature engineering.
• The framework of our model makes the intro-
duction of heterogeneous resource efficient