Engineering Applications of Artificial Intelligence 64 (2017) 43–51
Sprinkled semantic diffusion kernel for word sense disambiguation
Tinghua Wang (a, b, *), Wei Li (a), Fulai Liu (a), Jialin Hua (c)
(a) School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, PR China
(b) Decision Systems and e-Service Intelligence Laboratory, Centre for Artificial Intelligence, Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway NSW 2007, Australia
(c) School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, PR China
Article info
Keywords:
Word sense disambiguation (WSD)
Semantic diffusion kernel
Class information
Support vector machine (SVM)
Kernel method
Abstract
Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has
been a long-standing research objective for natural language processing (NLP). In this paper, we are concerned
with kernel methods for automatic WSD. Under this framework, the main difficulty is to design an appropriate
kernel function to represent the sense distinction knowledge. The semantic diffusion kernel, which models semantic
similarity by means of a diffusion process on a graph defined by lexicon and co-occurrence information to smooth
the typical “Bag of Words” (BOW) representation, has been successfully applied to WSD. However, the diffusion
is an unsupervised process, which fails to exploit the class information in a supervised classification scenario. To
address this limitation, we present a sprinkled semantic diffusion kernel that makes use of the class knowledge of
training documents in addition to the co-occurrence knowledge. The basic idea is to construct an augmented term-
document matrix by encoding class information as additional terms and appending them to training documents.
Diffusion is then performed on the augmented term-document matrix. In this way, the words belonging to the
same class are indirectly drawn closer to each other, and hence the class-specific word correlations are strengthened.
We evaluate our method on several Senseval/SemEval benchmark examples with a support vector machine (SVM),
and show that the proposed kernel significantly improves the disambiguation performance over the semantic
diffusion kernel in terms of different measures and yields results competitive with the state-of-the-art kernel
methods for WSD.
© 2017 Elsevier Ltd. All rights reserved.
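The construction summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the diffusion takes the exponential form exp(λG) with G = DDᵀ for a term-by-document matrix D, which is one common formulation of the semantic diffusion kernel. It also assumes that the sprinkled class rows are dropped after diffusion so that unlabeled test documents can be compared. The function names and the `copies` parameter are hypothetical.

```python
import numpy as np

def sprinkle(D, labels, n_classes, copies=1):
    """Append class-indicator rows ("sprinkled terms") to a term-by-document matrix.

    D has shape (n_terms, n_docs); labels[j] is the class index of document j.
    `copies` (a hypothetical knob) repeats each class row to strengthen its effect.
    """
    n_docs = D.shape[1]
    indicator = np.zeros((n_classes, n_docs))
    indicator[np.asarray(labels), np.arange(n_docs)] = 1.0
    return np.vstack([D, np.repeat(indicator, copies, axis=0)])

def diffusion_matrix(D, lam):
    """Return exp(lam * G) for G = D D^T (assumed form of the diffusion)."""
    G = D @ D.T
    w, V = np.linalg.eigh(G)              # G is symmetric, so eigh applies
    return (V * np.exp(lam * w)) @ V.T    # V diag(exp(lam * w)) V^T

def sprinkled_kernel(D_train, labels, n_classes, lam=0.1, copies=1):
    """Build a kernel function from diffusion on the augmented training matrix."""
    n_terms = D_train.shape[0]
    S_aug = diffusion_matrix(sprinkle(D_train, labels, n_classes, copies), lam)
    # Assumption: keep only the block over real terms. A principal submatrix
    # of a positive semi-definite matrix is still positive semi-definite.
    S = S_aug[:n_terms, :n_terms]
    return lambda X, Y: X.T @ S @ Y

# Toy data: four terms, four one-word documents, two classes.
# Terms 0 and 1 never co-occur, but their documents share class 0.
D = np.eye(4)
labels = [0, 0, 1, 1]
kernel = sprinkled_kernel(D, labels, n_classes=2, lam=0.1)
K = kernel(D, D)   # Gram matrix over the training documents
```

In this toy case the plain diffusion leaves terms 0 and 1 unrelated, because they never co-occur. After sprinkling, the two terms are linked through the shared class row, so `K[0, 1]` becomes positive while `K[0, 2]`, which crosses classes, stays at zero up to rounding. The resulting Gram matrix could then be passed to any SVM implementation that accepts precomputed kernels.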
1. Introduction
Ambiguity is inherent to human language. In particular, word sense
ambiguity is prevalent in all natural languages, with a large number of
words having more than one meaning. For instance, the English noun
bank can mean “sloping raised land, especially along the sides of a river” or
“an organization where people and businesses can invest or borrow money,
convert to foreign money, etc. or a building where these services are offered”.
The correct sense of an ambiguous word can be determined based on
the context where it occurs, and correspondingly the problem of word
sense disambiguation (WSD) is defined as the task of automatically
assigning the most appropriate meaning to a polysemous word in a given
context (Navigli, 2009). As a fundamental semantic understanding
task at the lexical level in natural language processing (NLP), WSD
can benefit many applications such as machine translation, information
retrieval, parsing, and question answering. WSD is considered to be a
key step in order to approach language understanding beyond keyword
matching (Agirre et al., 2014). Although WSD is essentially a subconscious
process for humans and presents them with no difficulty, it is very difficult
to formalize the computational process of disambiguation: it is
classified among “AI-complete” problems (Turdakov, 2010), that is, it
is a task whose solution is at least as hard as the most difficult problems
in artificial intelligence.
* Corresponding author at: School of Mathematics and Computer Science, Gannan Normal University, Ganzhou 341000, PR China.
E-mail address: wthgnnu@163.com (T. Wang).
Generally, WSD methods can be classified into two types: knowledge-
based and machine learning (Navigli, 2009; Raviv and Markovitch,
2012). Knowledge-based WSD systems exploit the information in a
lexical knowledge base, such as WordNet and Wikipedia, to perform
WSD. These approaches usually pick the sense whose definition is most
similar to the context of the ambiguous word, by means of textual
overlap or using graph-based measures (Abualhaija and Zimmermann,
2016; Agirre et al., 2009; Navigli and Lapata, 2010). Machine learning
approaches, also called corpus-based approaches, do not make use of
any knowledge resources for disambiguation. These approaches range
http://dx.doi.org/10.1016/j.engappai.2017.05.010
Received 30 December 2015; Received in revised form 9 December 2016; Accepted 15 May 2017
0952-1976/© 2017 Elsevier Ltd. All rights reserved.