Sentiment Analysis by Capsules∗
Yequan Wang¹, Aixin Sun², Jialong Han³, Ying Liu⁴, Xiaoyan Zhu¹
¹ State Key Laboratory on Intelligent Technology and Systems
¹ Tsinghua National Laboratory for Information Science and Technology
¹ Department of Computer Science and Technology, Tsinghua University, Beijing, China
² School of Computer Science and Engineering, Nanyang Technological University, Singapore
³ Tencent AI Lab, Shenzhen, China
⁴ School of Engineering, Cardiff University, UK
tshwangyequan@gmail.com; axsun@ntu.edu.sg; jialonghan@gmail.com; liuy81@cardiff.ac.uk; zxy-dcs@tsinghua.edu.cn
ABSTRACT
In this paper, we propose RNN-Capsule, a capsule model based
on Recurrent Neural Network (RNN) for sentiment analysis. For
a given problem, one capsule is built for each sentiment category,
e.g., ‘positive’ and ‘negative’. Each capsule has an attribute, a state,
and three modules: representation module, probability module, and
reconstruction module. The attribute of a capsule is the assigned
sentiment category. Given an instance encoded in hidden vectors by
a typical RNN, the representation module builds capsule representa-
tion by the attention mechanism. Based on capsule representation,
the probability module computes the capsule’s state probability. A
capsule’s state is active if its state probability is the largest among
all capsules for the given instance, and inactive otherwise. On two
benchmark datasets (i.e., Movie Review and Stanford Sentiment
Treebank) and one proprietary dataset (i.e., Hospital Feedback),
we show that RNN-Capsule achieves state-of-the-art performance
on sentiment classification. More importantly, without using any
linguistic knowledge, RNN-Capsule is capable of outputting words
with sentiment tendencies reflecting capsules’ attributes. The words
well reflect the domain specificity of the dataset.
ACM Reference Format:
Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, Xiaoyan Zhu. 2018.
Sentiment Analysis by Capsules. In WWW 2018: The 2018 Web Conference,
April 23–27, 2018, Lyon, France. ACM, New York, NY, USA, 10 pages.
https://doi.org/10.1145/3178876.3186015
1 INTRODUCTION
Sentiment analysis, also known as opinion mining, is the field of
study that analyzes people’s sentiments, opinions, evaluations,
attitudes, and emotions from written languages [20, 26]. Many neural
network models have achieved good performance, e.g., Recursive
Auto Encoder [33, 34], Recurrent Neural Network (RNN) [21, 35],
and Convolutional Neural Network (CNN) [13, 14, 18].
∗ This work was done when Yequan was a visiting Ph.D. student at the School of Computer
Science and Engineering, Nanyang Technological University, Singapore.
This paper is published under the Creative Commons Attribution 4.0 International
(CC BY 4.0) license. Authors reserve their rights to disseminate the work on their
personal and corporate Web sites with the appropriate attribution.
© 2018 IW3C2 (International World Wide Web Conference Committee), published
under Creative Commons CC BY 4.0 License.
ACM ISBN 978-1-4503-5639-8/18/04.
Despite the great success of recent neural network models, they
have some shortcomings. First, existing models focus on, and heavily
rely on, the quality of instance representations. An instance here can
be a sentence, a paragraph, or a document. Using a vector to represent
sentiment is quite limiting because opinions are delicate and complex.
The capsule structure in our work gives the model more capacity
to model sentiments. Second, linguistic knowledge such as sentiment
lexicons, negation words (e.g., no, not, never), and intensity
words (e.g., very, extremely) needs to be carefully incorporated into
these models to realize their best potential in terms of prediction
accuracy. However, linguistic knowledge requires significant effort
to develop. Further, a developed sentiment lexicon may not be
applicable to some domain-specific datasets. For example, when
patients give feedback on hospital services, words like ‘quick’ and
‘caring’ are considered strongly positive. These words are
unlikely to be considered strongly positive in movie reviews. Our
capsule model does not need any linguistic knowledge, and is able to
output words with sentiment tendencies to explain the sentiments.
In this paper, we make the very first attempt to perform sentiment
analysis by capsules. A capsule is a group of neurons which
has rich significance [30]. We design each single capsule¹ to contain
an attribute, a state, and three modules (i.e., representation module,
probability module, and reconstruction module).
• The attribute of a capsule reflects its dedicated sentiment category,
  which is pre-assigned when we build the capsule. The number of
  capsules built equals the number of sentiment categories in a given
  problem. For example, a Positive Capsule and a Negative Capsule are
  built for a problem with two sentiment categories.
• The state of a capsule, i.e., ‘active’ or ‘inactive’, is determined by
  the probability modules of all capsules in the model. A capsule’s
  state is ‘active’ if the output of its probability module is the
  largest among all capsules, and ‘inactive’ otherwise.
• Regarding the three modules, the representation module uses the
  attention mechanism to build the capsule representation; the
  probability module uses the capsule representation to predict the
  capsule’s state probability; and the reconstruction module is used to
  rebuild the representation of the input instance. The input
  instance of a capsule model is a sequence (e.g., a sentence or a
  paragraph). In this work, the input instance representation of
  a capsule is computed through an RNN.
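The capsule structure described in the list above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The list only states that attention builds the capsule representation, that a probability is predicted from it, and that the capsule with the largest probability is active. The concrete choices below are assumptions made for illustration: dot-product attention with a learned vector, a sigmoid probability layer, a reconstruction obtained by scaling the capsule representation by its probability, and all names (`Capsule`, `w_att`, `w_prob`, `run_capsules`). Random vectors stand in for the RNN hidden vectors.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


class Capsule:
    """One capsule per sentiment category (hypothetical sketch)."""

    def __init__(self, attribute, dim, rng):
        self.attribute = attribute  # pre-assigned sentiment category
        # Assumed parameters: an attention vector and a logistic layer.
        self.w_att = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.w_prob = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.b_prob = 0.0

    def representation(self, hidden):
        """Representation module: attention-weighted sum of hidden vectors."""
        alphas = softmax([dot(h, self.w_att) for h in hidden])
        dim = len(hidden[0])
        return [sum(a * h[k] for a, h in zip(alphas, hidden)) for k in range(dim)]

    def probability(self, rep):
        """Probability module: sigmoid of a linear score (assumed form)."""
        return 1.0 / (1.0 + math.exp(-(dot(rep, self.w_prob) + self.b_prob)))

    def reconstruction(self, rep, prob):
        """Reconstruction module: scaling by prob is an assumption."""
        return [prob * x for x in rep]


def run_capsules(capsules, hidden):
    """Return (attribute, probability, state) per capsule; one is active."""
    probs = [c.probability(c.representation(hidden)) for c in capsules]
    winner = max(range(len(capsules)), key=lambda i: probs[i])
    return [
        (c.attribute, p, "active" if i == winner else "inactive")
        for i, (c, p) in enumerate(zip(capsules, probs))
    ]


rng = random.Random(0)
# Stand-in for RNN hidden vectors: 5 time steps, dimension 4.
hidden = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
capsules = [Capsule("positive", 4, rng), Capsule("negative", 4, rng)]
results = run_capsules(capsules, hidden)
```

Because each capsule holds its own attention parameters, different capsules can weight the same hidden vectors differently, which is consistent with the idea that each capsule attends to words reflecting its own attribute.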
¹ This work was done before the publication of [30]. The capsule in this work is designed
differently from that in [30].
Track: Web Content Analysis, Semantics and Knowledge