Discriminative Reordering with Chinese Grammatical Relations Features
Pi-Chuan Chang (a), Huihsin Tseng (b), Dan Jurafsky (a), and Christopher D. Manning (a)
(a) Computer Science Department, Stanford University, Stanford, CA 94305
(b) Yahoo! Inc., Santa Clara, CA 95054
{pichuan,jurafsky,manning}@stanford.edu, huihui@yahoo-inc.com
Abstract
The prevalence in Chinese of grammatical
structures that translate into English in dif-
ferent word orders is an important cause of
translation difficulty. While previous work has
used phrase-structure parses to deal with such
ordering problems, we introduce a richer set of
Chinese grammatical relations that describes
more semantically abstract relations between
words. Using these Chinese grammatical relations,
we improve a phrase orientation classifier
(introduced by Zens and Ney (2006)), which decides
the ordering of two phrases when translated into
English, by adding path features designed over the
Chinese typed dependencies. We then use the log
probability of the phrase orientation classifier as an
extra feature in a phrase-based MT system,
and obtain significant BLEU point gains on three
test sets: MT02 (+0.59), MT03 (+1.00) and
MT05 (+0.77). Our Chinese grammatical relations
are also likely to be useful for other
NLP tasks.
1 Introduction
Structural differences between Chinese and English
are a major factor in the difficulty of machine translation
from Chinese to English. The wide variety
of such Chinese-English differences includes the ordering
of head nouns and relative clauses, and the
ordering of prepositional phrases and the heads they
modify. Previous studies have shown that using syntactic
structures from the source side can help MT
performance on these constructions. Most of the
previous syntactic MT work has used phrase struc-
ture parses in various ways, either by doing syntax-
directed translation to directly translate parse trees
into strings in the target language (Huang et al.,
2006), or by using source-side parses to preprocess
the source sentences (Wang et al., 2007).
(a)
(ROOT
  (IP
    (LCP
      (QP (CD 三)
        (CLP (M 年)))
      (LC 来))
    (PU ，)
    (NP
      (DP (DT 这些))
      (NP (NN 城市)))
    (VP
      (ADVP (AD 累计))
      (VP (VV 完成)
        (NP
          (NP
            (ADJP (JJ 固定))
            (NP (NN 资产)))
          (NP (NN 投资)))
        (QP (CD 一百二十亿)
          (CLP (M 元)))))
    (PU 。)))
(b)
(ROOT
  (IP
    (NP
      (DP (DT 这些))
      (NP (NN 城市)))
    (VP
      (LCP
        (QP (CD 三)
          (CLP (M 年)))
        (LC 来))
      (ADVP (AD 累计))
      (VP (VV 完成)
        (NP
          (NP
            (ADJP (JJ 固定))
            (NP (NN 资产)))
          (NP (NN 投资)))
        (QP (CD 一百二十亿)
          (CLP (M 元)))))
    (PU 。)))
Words and glosses: 三 (three), 年 (year), 来 (over; in), 这些 (these), 城市 (city), 累计 (collectively), 完成 (complete), 固定 (fixed), 资产 (asset), 投资 (invest), 一百二十亿 (12 billion), 元 (yuan)
Typed dependency labels: loc, nsubj, advmod, dobj, range, lobj, det, nn, nummod, amod, nummod
Figure 1: Sentences (a) and (b) have the same meaning,
but different phrase structure parses. Both sentences,
however, have the same typed dependencies shown at the
bottom of the figure.

One intuition for using syntax is to capture different
Chinese structures that might have the same
meaning and hence the same translation in English.
But it turns out that phrase structure (and linear or-
der) are not sufficient to capture this meaning rela-
tion. Two sentences with the same meaning can have
different phrase structures and linear orders. In the
example in Figure 1, sentences (a) and (b) have the
same meaning, but different linear orders and dif-
ferent phrase structure parses. The translation of
sentence (a) is: “In the past three years these mu-
nicipalities have collectively put together investment
in fixed assets in the amount of 12 billion yuan.” In
sentence (b), “in the past three years” has moved its