Sentiment Tendency Analysis Based on Sense Groups: A New Method for Chinese Microtext


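With the rapid growth of the internet in China, microblogs, a major form of microtext, have become an important platform where users express opinions and share details of daily life. Against this background, sentiment tendency analysis of Chinese microblogs has become increasingly important, because understanding users' emotional leanings is valuable for corporate public-opinion monitoring, product recommendation, and market strategy. The paper "Research on Sentiment Tendency Analysis of Microtext Based on Sense Group" focuses on a new method for addressing the challenges of sentiment tendency analysis in Chinese microtext.

Traditional Chinese sentiment analysis often struggles with the complexity of the language: idioms, polysemous words, negation words, degree adverbs, and punctuation can all make the result of an analysis uncertain. The paper proposes a method based on partitioned sense groups (STDSG) to address this. STDSG first partitions a microtext into separate sense groups, so that the sentiment each part may carry can be judged on its own. It then uses a sentiment lexicon to determine the sentiment tendency of each sense group, which helps reduce the effect of ambiguity.

The authors also account for the way negation words reverse sentiment polarity: "不好" ("not good") expresses a negative sentiment in its own right rather than merely negating "好" ("good"). They likewise consider degree adverbs such as "非常" ("very") and "稍微" ("slightly"), which strengthen or weaken the original sentiment. Punctuation is usually treated as auxiliary information, but in certain contexts it can also affect how the sentiment is read.

The paper validates STDSG experimentally. According to the reported results, STDSG captures the sentiment tendency of Chinese microtext more accurately than traditional methods, improving the precision and robustness of the analysis. This has practical value for social-media sentiment analysis, particularly when processing large volumes of unstructured Chinese data. In summary, the paper examines how sense groups and a sentiment lexicon can be combined to analyze the sentiment tendency of Chinese microtext, and offers a strategy for improving accuracy in a linguistically complex setting.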
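The pipeline summarized above (partition into sense groups, look up a lexicon, then adjust for negation, degree adverbs, and punctuation) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and not taken from the paper: the toy lexicons and their weights, splitting at punctuation as a stand-in for the paper's sense-group partitioning, the greedy longest-match tokenizer, and the 1.5 multiplier for a closing exclamation mark.

```python
import re

# Hypothetical toy lexicons; the paper's Emotional Dictionary is not reproduced here.
SENTIMENT = {"好": 1.0, "喜欢": 1.0, "开心": 1.0, "差": -1.0, "讨厌": -1.0, "失望": -1.0}
NEGATIONS = {"不", "没有", "没"}
DEGREES = {"非常": 2.0, "很": 1.5, "稍微": 0.5}  # assumed weights

# Longest words first, so "没有" is matched before "没".
VOCAB = sorted(set(SENTIMENT) | NEGATIONS | set(DEGREES), key=len, reverse=True)


def split_sense_groups(text):
    """Split text at punctuation into (group, closing_mark) pairs.

    Assumption: punctuation boundaries approximate sense-group boundaries.
    """
    parts = re.split(r"([，,。；;！!？?])", text)
    groups = []
    for i in range(0, len(parts), 2):
        body = parts[i].strip()
        mark = parts[i + 1] if i + 1 < len(parts) else ""
        if body:
            groups.append((body, mark))
    return groups


def tokenize(group):
    """Greedy longest-match against VOCAB; unknown characters become single tokens."""
    tokens, i = [], 0
    while i < len(group):
        match = next((w for w in VOCAB if group.startswith(w, i)), group[i])
        tokens.append(match)
        i += len(match)
    return tokens


def score_group(body, mark):
    """Score one sense group, applying negation, degree, and punctuation rules."""
    score, negated, degree = 0.0, False, 1.0
    for tok in tokenize(body):
        if tok in NEGATIONS:
            negated = not negated          # a negation flips the next sentiment word
        elif tok in DEGREES:
            degree *= DEGREES[tok]         # a degree adverb scales the next sentiment word
        elif tok in SENTIMENT:
            polarity = SENTIMENT[tok] * degree
            score += -polarity if negated else polarity
            negated, degree = False, 1.0   # modifiers apply to one sentiment word only
    if mark in ("！", "!"):
        score *= 1.5                       # assumed emphasis for an exclamation mark
    return score


def sentiment_tendency(text):
    """Return (label, per-group scores) for a microtext."""
    scores = [score_group(body, mark) for body, mark in split_sense_groups(text)]
    total = sum(scores)
    label = "positive" if total > 0 else "negative" if total < 0 else "neutral"
    return label, scores


print(sentiment_tendency("这个手机不好，但是屏幕非常好！"))
```

Scoring each group separately keeps a negation in one clause from leaking into the next, which is the intuition behind partitioning before lexicon lookup. A real system would need a proper word segmenter and a full lexicon; for instance, this toy tokenizer would misread "不错" ("not bad") as a bare negation.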

Research on Sentiment Tendency Analysis of Microtext Based on Sense Group

Bin Gui

School of Information

Renmin University of China

Beijing, China

guibin_163@163.com

Xiaoping Yang

School of Information

Renmin University of China

City, Country

yang@ruc.edu.cn

Abstract—With the development of the internet in China, microblogging provides a new platform for communicating and sharing information among Web users. Users can express opinions and record daily life using microblogs. Microblogs posted by users indicate their interests to some extent. However, the sentiment hidden in Chinese microtext is very hard to analyze because of its complexity. This paper proposes a new way to determine the sentiment tendency of Chinese microtext based on the partitioned Sense Group (STDSG). To judge the sentiment tendency of a microtext, we first partition it into separate sense groups and then determine its sentiment tendency based on an Emotional Dictionary. We also consider several factors, including negations, degree adverbs, and punctuation. The effectiveness of STDSG is strongly supported by the results of our experiments.

Keywords-sense group, Sina Weibo, sentiment tendency, degree word, negation word

I. INTRODUCTION

In the past few years, there has been a huge growth in the use of microblogging platforms such as Sina Weibo and Twitter. On a microblogging website, users are able to post short messages of a limited length, e.g., 140 English or Chinese characters, to communicate and share information with each other [1]. Web users usually use microblogs to express opinions and record daily life. Therefore, the messages posted by microblog users, to some extent, indicate their sentiment.

Sentiment analysis, or opinion mining, has been a hot area of academic research because of the challenges it poses. It is also a vital question for industry, as it gives insight into consumers' minds and their decision-making processes, besides providing explicit feedback about the performance of any widely used and talked-about product, service, event, or phenomenon. For governments, automated sentiment analysis of microblog posts is also of interest, because it allows public sentiment towards people and events to be monitored as they happen.

While there has been a fair amount of research on how sentiments are expressed in genres such as online reviews and news articles, how sentiments are expressed given the informal language and message-length constraints of microblogging has been much less studied. Features such as automatic part-of-speech tags and resources such as sentiment lexicons have proved useful for sentiment analysis in Twitter, but will they also prove useful for sentiment analysis in Chinese microblogging? How can the accuracy of sentiment analysis be improved? We will examine these issues.

The rest of the paper is organized as follows. Related work on sentiment analysis in general and in microblogging is discussed in Section 2. Section 3 describes the algorithm for dividing a sentence into sense groups. Section 4 illustrates sentiment tendency analysis on Chinese microblogging. Section 5 describes experimental results. Section 6 concludes.

II. RELATED WORK

Sentiment analysis is one of the hottest topics in data mining and natural language processing. Also called Opinion Mining, Opinion Analysis, Sentiment Classification, or Subjectivity Analysis, it focuses on how to recognize, categorize, label, and extract the sentiments and viewpoints hidden in subjective texts [2]. Research on sentiment analysis falls into three levels: word level, sentence level, and passage level, among which word-level analysis is the foundation of the other two. Turney [3] quantified a word's tendency as a real number, which is then used to classify the tendency of a whole passage as "compliment" or "criticism" by means of machine learning. Hatzivassiloglou [4] attained this goal through the semantic relations between words. Kamps et al. [5] also achieved it with the help of word similarity provided by WordNet, but with two defects: only adjectives and only synonyms are considered. Du Weifu [6] presented an extensible tendency calculation framework that treats tendency calculation as an optimization problem. Meena et al. [7] analyzed sentiment tendency considering not only single words but also sentence structure, grammar, and other semantic information. Wang et al. adopted a hybrid approach that integrates heuristic rules and Bayesian classification, taking adjectives and adverbs as feature words [8]. Wang Gen et al. [9] applied conditional random fields (CRF) to sentence sentiment analysis and presented an approach based on redundancy labeling, while Yang Chao et al. [10] took the adverbs in a sentence into consideration to mine the sentiment tendency in internet comments. Machine learning was brought into passage-level analysis for the first time by Pang, who compared three classification models (NB, ME, and SVM) using n-gram word features and concluded that unigram features have the best effect [11]. However, Cui's [12] experiment showed that unigrams only perform well on small-scale training corpora and it was n-gram
