Learning Domain Differences
Automatically for Dependency Parsing Adaptation
Mo Yu, Tiejun Zhao, and Yalong Bai
Harbin Institute of Technology
China
{yumo, tjzhao, ylbai}@mtlab.hit.edu.cn
Abstract
In this paper, we address the relation between domain
differences and domain adaptation for dependency
parsing. Our quantitative analyses showed
that it is the inconsistent behavior of the same features
across domains, rather than word or feature coverage,
that is the major cause of the performance decrease
of an out-domain model. We further studied
these ambiguous features in depth and found that
the set of ambiguous features is small and has a
concentrated distribution. Based on these analyses, we
proposed a DA method. The DA method can automatically
learn which features are ambiguous across
domains according to the errors made by the out-domain
model on in-domain training data. Our method is
also extended to utilize multiple out-domain models.
The results of dependency parser adaptation
from WSJ to Genia and QuestionBank showed that
our method achieved significant improvements on
small in-domain datasets, where DA is most
needed. Additionally, we improved on the
best published results of the CoNLL07 shared task
on domain adaptation, which confirms the significance
of our analyses and our method.
1 Introduction
Statistical models are widely used in the field of dependency
parsing. However, current models are usually trained and
tested on data from the same domain. When the test data belongs
to a domain different from the training data, the performance
of current dependency parsing models degrades
greatly. Therefore, when the labeled Treebank of the target
domain is insufficient, it is difficult to obtain accurate parsing
results in that domain.
To quickly adapt parsers to new domains where little in-domain
labeled data is available, various techniques have
been proposed. Most parser domain adaptation (DA) methods
need no labeled data from the target domain, e.g. self-training
[McClosky et al., 2008; Sagae, 2010], co-training
[Steedman et al., 2003; Sagae and Tsujii, 2007] and word
clustering approaches [Candito et al., 2011]. These unsupervised
methods improve performance by helping parsers
cover more domain-specific words or features [McClosky and
Charniak, 2008].
However, as will be shown in this paper, word and feature
coverage is not the only factor affecting cross-domain performance.
Specifically, we take the WSJ corpus and the Genia corpus
[Tateisi et al., 2005] as examples. In our analysis, even
after we added gold POS tags and eliminated the gap in word
coverage, the performance decline was still not alleviated
much. Instead, it is the ambiguous features, which behave inconsistently
in different domains, that cause the performance
drop. In addition, Dredze et al. [2007] pointed out that domain
differences may also arise from different annotation guidelines
between Treebanks.
The above findings indicate that some labeled data is needed
to handle such differences. Unlike unsupervised
methods, which have difficulty detecting and handling these
differences, current supervised and semi-supervised parser adaptation
methods [Hall et al., 2011] have been shown to achieve better results. However,
they do not directly address the domain differences on the features
discussed above.
In this paper, we try to learn which features are ambiguous
between domains with the help of only a small in-domain labeled
dataset. The key idea is to learn which features are more
likely to be associated with errors, based on the in-domain training
data. The model can then identify and correct the unreliable
arcs based on the ambiguous features they contain,
while still keeping as many reliable arcs output by the out-domain
model as possible.
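The key idea above can be illustrated with a minimal sketch. Here, each arc predicted by a hypothetical out-domain model on the in-domain training data is paired with a flag marking whether it disagrees with the gold tree; features with a high error rate are flagged as ambiguous. The function name, feature strings, and the simple error-rate scoring are illustrative assumptions, not the paper's exact formulation:

```python
from collections import Counter

def find_ambiguous_features(arcs, threshold=0.5, min_count=2):
    """Estimate which features behave inconsistently across domains.

    `arcs` is a list of (features, is_error) pairs: the set of features
    fired on an arc predicted by the out-domain model on in-domain
    training data, and whether that arc disagrees with the gold tree.
    Returns the features whose error rate exceeds `threshold`, ignoring
    rare features (fewer than `min_count` occurrences) to avoid
    overfitting the small in-domain dataset.
    """
    total, errors = Counter(), Counter()
    for features, is_error in arcs:
        for f in features:
            total[f] += 1
            if is_error:
                errors[f] += 1
    return {f for f in total
            if total[f] >= min_count and errors[f] / total[f] > threshold}

# Toy example with hypothetical POS-pair features:
arcs = [
    ({"pos:NN->IN"}, True),               # out-domain model erred here
    ({"pos:NN->IN"}, True),
    ({"pos:NN->IN", "dist:2"}, False),
    ({"pos:DT->NN"}, False),
    ({"pos:DT->NN"}, False),
]
print(find_ambiguous_features(arcs))      # only "pos:NN->IN" is flagged
```

Arcs containing a flagged feature would then be treated as unreliable and reconsidered, while arcs built only from consistent features are kept from the out-domain model's output.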
There are two major contributions in this paper. First, in
Section 2, quantitative analyses are performed to find out
which types of domain differences most affect the cross-domain
performance of a parser. As far as we know, few
works [Gildea, 2001] have focused on this problem. Second,
based on some general rules found in the analyses, in Section
3 we propose a method to automatically learn domain differences
from a small in-domain dataset while avoiding
overfitting. Experimental results are shown in Section 4.
Section 5 gives the conclusion.
2 Analysis on Domain Differences
2.1 Experimental Settings
In this section, we use the Genia and WSJ corpora as the in-domain
and out-domain data, respectively. For the WSJ corpus, sections
Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence