Sentiment-Aspect Extraction based on Restricted Boltzmann Machines
Linlin Wang¹, Kang Liu²*, Zhu Cao¹, Jun Zhao² and Gerard de Melo¹
¹ Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
² National Laboratory of Pattern Recognition, Institute of Automation,
Chinese Academy of Sciences, Beijing, China
{ll-wang13, cao-z13}@mails.tsinghua.edu.cn, {kliu, jzhao}@nlpr.ia.ac.cn, gdm@demelo.org
Abstract
Aspect extraction and sentiment analysis
of reviews are both important tasks in
opinion mining. We propose a novel senti-
ment and aspect extraction model based on
Restricted Boltzmann Machines to jointly
address these two tasks in an unsupervised
setting. This model reflects the gener-
ation process of reviews by introducing
a heterogeneous structure into the hidden
layer and incorporating informative priors.
Experiments show that our model outper-
forms previous state-of-the-art methods.
1 Introduction
Nowadays, it is commonplace for people to express
their opinions about various sorts of entities,
e.g., products or services, on the Internet, especially
in the course of e-commerce activities. Analyzing
online reviews not only helps customers obtain
useful product information, but also provides
companies with feedback to enhance their products
or service quality. Aspect-based opinion mining
enables people to consider much more fine-grained
analyses of vast quantities of online reviews,
perhaps from numerous different merchant
sites. Thus, automatic identification of aspects of
entities and relevant sentiment polarities in Big
Data is a significant and urgent task (Liu, 2012;
Pang and Lee, 2008; Popescu and Etzioni, 2005).
Identifying aspects and analyzing sentiment
words in reviews serves the ultimate goal of discerning
people’s opinions, attitudes, emotions, etc.
towards entities such as products, services, organizations,
individuals, events, etc. In this context,
aspect-based opinion mining, also known as
feature-based opinion mining, aims at extracting
and summarizing particular salient aspects of entities
and determining relevant sentiment polarities
from reviews (Hu and Liu, 2004). Consider reviews
of computers, for example. A given computer’s
components (e.g., hard disk, screen) and
attributes (e.g., volume, size) are viewed as aspects
to be extracted from the reviews, while sentiment
polarity classification consists in judging whether
an opinionated review expresses an overall positive
or negative opinion.
* Corresponding author: Kang Liu (kliu@nlpr.ia.ac.cn)
Regarding aspect identification, previous meth-
ods can be divided into three main categories:
rule-based, supervised, and topic model-based
methods. For instance, association rule-based
methods (Hu and Liu, 2004; Liu et al., 1998)
tend to focus on extracting product feature words
and opinion words but neglect connecting product
features at the aspect level. Existing rule-based
methods are typically unable to group the extracted
aspect terms into categories. Supervised
(Jin et al., 2009; Choi and Cardie, 2010) and semi-
supervised learning methods (Zagibalov and Car-
roll, 2008; Mukherjee and Liu, 2012) were intro-
duced to resolve certain aspect identification prob-
lems. However, supervised training requires hand-
labeled training data and has trouble coping with
domain adaptation scenarios.
Hence, unsupervised methods are often adopted
to avoid this sort of dependency on labeled data.
Latent Dirichlet Allocation (LDA; Blei et al., 2003)
performs well in automatically
extracting aspects and grouping corresponding
representative words into categories. Thus, a number
of LDA-based aspect identification approaches
have been proposed in recent years (Brody and Elhadad,
2010; Titov and McDonald, 2008; Zhao et
al., 2010). Still, these methods have several important
drawbacks. First, inaccurate approximations
of the distribution over topics may reduce the
computational accuracy. Second, mixture models
are unable to exploit the co-occurrence of topics
to yield high probability predictions for words that
are sharper than the distributions predicted by in-