36 L. Gui et al. / Knowledge-Based Systems 124 (2017) 34–45
2.2. Sentiment classification in product reviews
Focusing on sentiment classification in product reviews, existing
methods can be categorized into text classification based methods
and user/product modeling based methods.
2.2.1. Text classification based method
In principle, any text classification method can be implemented
in sentiment classification. Training a classifier (such as SVM) with
unigrams, bi-grams and trigrams as features is a strong baseline
for sentiment classification [14]. Besides the text features above,
sentiment lexicon features or sentiment-specific word embeddings
are also important in sentiment classification [18].
In recent years, deep learning based methods have achieved
state-of-the-art performance on this problem. The Recursive Neural
Tensor Network [8], Convolutional Neural Network [7] and Gated
Recurrent Neural Network [9] have achieved great success. The Recur-
sive Neural Tensor Network [8] composes words into sentences by
sharing parameters along the syntactic structure. The Convolutional
Neural Network [7] uses convolutional filters to extract phrase-level
features, then applies a pooling operation to select the most rele-
vant features to model a sentence. In the Gated Recurrent Neural Net-
work [9], a sentence is regarded as a sequence of words, and any
sequence can be modeled by the last word and the preceding sub-
sequence. These deep learning methods achieve state-of-the-art
performance on different datasets of the sentiment classification task
in product reviews.
However, the text classification based methods do not consider
the user or product information, which is important for sentiment
classification in product reviews.
2.2.2. User modeling based method
Apart from text, user information can also be used in sentiment
classification. Gao et al. [28] designed user specific features to cap-
ture opinion holder leniency. Dong et al. [29] incorporated textual
topics and user-word factors into supervised topic modeling. Hovy
[30] used demographic information in sentiment analysis. Tan et al.
[31] and Hu et al. [32] used user-user relationships for Twitter sen-
timent analysis.
Here, deep learning methods incorporating user information
achieve the best performance. For example, Tang et al.
[33] incorporated the user and product information into convo-
lutional neural networks for sentiment analysis. Recently, we also
proposed a method to model users based on the concept of inter-
subjectivity [11] . The basic idea is to learn user embeddings based
on their shared words in reviews which have similar polarities.
However, what has been overlooked in these two methods is that
the meanings of words may be also impacted by users and prod-
ucts (opinion targets) and hence representations of words should
be updated together with user and product representations.
In summary, existing methods mainly considered three types of
user information, namely: (1) personal profiles such as age, gender,
etc., to characterize users; (2) latent topics extracted from text as
a proxy for users sharing similar topical interests; (3) rating
patterns of users on product reviews. None of the above
methods considered the subtle interplays between users and the
words used by users sharing similar sentiments.
In this paper, we first propose a heterogeneous network embed-
ding method to represent words, users and products in a unified
embedding space, based on a heterogeneous network constructed
at the word level. As will be shown in our experiments, the three
kinds of learned embeddings offer better performance for the
text classification based method. Incorporating the user/product
embeddings into a convolutional neural network improves the user
modeling based method and gives state-of-the-art results on
three product review datasets for document-level sentiment classi-
fication.
3. Our approach
Our approach is inspired by a theory from sociology. In this
section, we will first define our problem, and then discuss how
to construct a heterogeneous network from text and other infor-
mation. Next, we will present the network embedding method to
model users, products and words in the same embedding space.
Finally, we will explain how to incorporate the learned representa-
tions into a CNN for sentiment classification.
3.1. Problem setup
In the corpora used in our experiments, we assume there are
a total of D review articles written by |U| users for |P| products.
Here, U and P stand for the sets of users and products, and |·|
denotes the size of a set. Each review article x_i ∈ X is represented
by a 3-tuple consisting of its author (or user) u_i, the opinion
target (or product) p_i and the text content d_i, i.e.,

x_i = {u_i, p_i, d_i}, i ∈ {1, 2, ..., D}    (1)
Note that a user may post multiple reviews, and a product may
receive reviews from different users.
We also assume that a review document d_i contains L_i words,
d_i = {w_{i1}, w_{i2}, ..., w_{iL_i}}, where w_{ij} is the j-th word
in document d_i.
Given a fixed set of sentiment classes Y = {y_1, y_2, ..., y_{|Y|}},
the goal of sentiment classification is to train a function F to map
reviews to sentiment classes:

F : X → Y    (2)
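The problem setup above can be sketched in code. This is a minimal illustration of the data representation only, not the paper's implementation; all names (`Review`, the sample reviews) are assumptions for the example.

```python
from dataclasses import dataclass
from typing import List

# Each review x_i is a 3-tuple of its author u_i, the reviewed product p_i,
# and the word sequence d_i, as in Eq. (1).
@dataclass
class Review:
    user: str          # author u_i
    product: str       # opinion target p_i
    words: List[str]   # text content d_i = [w_i1, ..., w_iLi]

reviews = [
    Review("u1", "phone_A", ["screen", "freezes", "constantly"]),
    Review("u2", "phone_A", ["great", "battery", "life"]),
]

# A user may post multiple reviews; a product may receive reviews
# from different users, so U and P are sets.
users = {r.user for r in reviews}
products = {r.product for r in reviews}
print(len(reviews), len(users), len(products))  # D=2, |U|=2, |P|=1
```

The classifier F: X → Y of Eq. (2) would then take a `Review` and return one of the fixed sentiment classes.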
3.2. Word/user/product representation learning with network
embedding
For learning representations of words, users and products, it is
important to define context for each of them. In traditional word
representation learning methods, such as continuous bag-of-words
(CBOW) or skip gram, the context for each word is typically de-
fined as a 5-word window (two words before and after the tar-
get word). In our previously proposed user representation learning
method [11] , users should be similar to each other if they share
similar subjective terms. Hence, the context for users is defined
as their shared subjective terms in their reviews. In our work here,
we argue that there exist subtle interplays among words, users and
products. For example, words such as 'freezes' and 'hangs' are often
used in negative reviews of mobile phones. These two words
should carry similar semantic meanings and hence their represen-
tations should be placed in nearby locations in the embedding
space. As such, word, user and product representation learning
should be performed simultaneously to map them into a unified
embedding space.
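The two notions of "context" above can be made concrete with a small sketch. The helper names and the tiny subjective-term list are illustrative assumptions, not part of the paper's method.

```python
# Context for a word: a 5-word window, i.e., up to two words before
# and two words after the target word.
def word_contexts(words, half_window=2):
    ctx = {}
    for j, w in enumerate(words):
        lo, hi = max(0, j - half_window), min(len(words), j + half_window + 1)
        ctx.setdefault(w, set()).update(words[lo:j] + words[j + 1:hi])
    return ctx

# Illustrative stand-in for a subjectivity lexicon.
SUBJECTIVE = {"freezes", "hangs", "great", "terrible"}

# Context for a user: the subjective terms appearing in his/her reviews;
# users sharing such terms should end up with similar representations.
def user_context(user_reviews):
    terms = set()
    for words in user_reviews:
        terms.update(w for w in words if w in SUBJECTIVE)
    return terms

print(sorted(word_contexts(["the", "phone", "freezes", "a", "lot"])["freezes"]))
print(sorted(user_context([["it", "freezes"], ["it", "hangs", "often"]])))
```

The point of the unified embedding space is that both kinds of context (and product contexts) are optimized jointly rather than in separate models.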
We propose to first build a heterogeneous network, in which
words, users, products, as well as sentiment labels are vertices and
statistical relations between them are edges. Then we use a net-
work embedding method to learn the distributed representation of
each vertex including words, users and products.
3.2.1. Construction of heterogeneous network
Here, we define the network as a graph: G = {E, V}, where E is the
set of edges and V is the set of vertices, the union of all
words, users, and products. We need to learn the weight of each
edge, defined as ω(e), where e = {u, v}, e ∈ E and u, v ∈ V.
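A minimal sketch of such a graph follows, assuming a simple co-occurrence count as the edge weight ω(e); the class and method names are illustrative, and the actual weighting scheme is defined by the relation categories described next.

```python
from collections import defaultdict

# Heterogeneous network G = {E, V}: typed vertices (word/user/product)
# and undirected weighted edges e = {u, v}.
class HeterogeneousNetwork:
    def __init__(self):
        self.vertices = {}                 # vertex -> type
        self.weights = defaultdict(float)  # frozenset({u, v}) -> ω(e)

    def add_vertex(self, v, vtype):
        self.vertices[v] = vtype

    def add_edge(self, u, v, w=1.0):
        # Accumulating weights models, e.g., user-word co-occurrence counts.
        self.weights[frozenset((u, v))] += w

    def weight(self, u, v):
        return self.weights.get(frozenset((u, v)), 0.0)

g = HeterogeneousNetwork()
g.add_vertex("alice", "user")
g.add_vertex("freezes", "word")
g.add_edge("alice", "freezes")
g.add_edge("alice", "freezes")   # "alice" used "freezes" twice
print(g.weight("alice", "freezes"))  # 2.0
```

Using `frozenset` for edge keys makes {u, v} order-independent, matching the undirected definition of e.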
In order to capture the information of users and products, we
have included an additional type of vertex: sentiment polarities.
Four categories of relations should be considered in the construction
of the heterogeneous network (Fig. 1):