Writing sentiment classification code in Python
Posted: 2023-12-17 07:56:14
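Sentiment classification draws on natural language processing and machine learning, so it relies on a number of libraries and algorithms. Below is a simple sentiment classification example that uses the Naive Bayes algorithm and the nltk library.
First, install nltk:
```
pip install nltk
```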
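Next, import the required libraries. The movie review corpus is not bundled with the package, so download it once before first use:
```python
import nltk
import random
from nltk.corpus import movie_reviews

nltk.download("movie_reviews")  # one-time corpus download
```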
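Then define a function that extracts features from a text. Each distinct word becomes a feature whose value is `True`, so the model only sees whether a word is present, not how often it occurs:
```python
def extract_features(words):
    # Map each distinct token to True (presence-only bag of words)
    return {word: True for word in words}
```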
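To make the feature format concrete, the sketch below applies the same function to a short hand-written token list. The sample tokens are made up for illustration and do not come from the corpus.

```python
def extract_features(words):
    # Map each distinct token to True (presence-only bag of words)
    return {word: True for word in words}


# Hypothetical tokens, chosen only to show that repeats collapse
sample_tokens = ["a", "great", "great", "movie"]
features = extract_features(sample_tokens)
print(features)  # {'a': True, 'great': True, 'movie': True}
```

The repeated token appears only once in the result, which is why this representation records presence rather than frequency.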
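Next, load the movie review corpus and store each review's words together with its sentiment label (positive or negative) in a list, then shuffle the list so the two classes are mixed:
```python
# Pair each review's tokens with its label ("pos" or "neg")
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)
```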
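Now apply the feature extraction function to every review and split the result into a training set and a test set. The first 100 shuffled reviews are held out for testing and the rest are used for training:
```python
featuresets = [(extract_features(document), category)
               for (document, category) in documents]
# Hold out the first 100 shuffled reviews for testing
train_set, test_set = featuresets[100:], featuresets[:100]
```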
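The slicing pattern can be checked on a small stand-in list. The sketch below is only an illustration: the toy data, the hold-out size `n_test`, and the fixed seed are assumptions (the original code does not seed the shuffle), added so that the split is reproducible.

```python
import random

# Stand-in for the real (features, label) pairs: 10 toy items
featuresets = [({"token_%d" % i: True}, "pos" if i % 2 == 0 else "neg")
               for i in range(10)]

rng = random.Random(42)  # fixed seed: an assumption, not in the original
rng.shuffle(featuresets)

n_test = 3  # hypothetical hold-out size; the text above uses 100
test_set, train_set = featuresets[:n_test], featuresets[n_test:]
print(len(train_set), len(test_set))  # 7 3
```

Because the two slices share the same boundary index, no item ends up in both sets.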
Next, train a Naive Bayes classifier on the training set and evaluate it on the test set:
```python
classifier = nltk.NaiveBayesClassifier.train(train_set)
print("Accuracy:", nltk.classify.accuracy(classifier, test_set))
```
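For readers who want to see roughly what a Naive Bayes classifier computes, here is a simplified from-scratch sketch in plain Python. It is not NLTK's implementation: the function names, the add-one smoothing, and the toy training data are all assumptions made for illustration, and NLTK's own estimator differs in its details.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_featuresets):
    """Count labels and, per label, how many documents contain each feature."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, label in labeled_featuresets:
        label_counts[label] += 1
        for name in features:
            feature_counts[label][name] += 1
            vocabulary.add(name)
    return label_counts, feature_counts, vocabulary


def classify(model, features):
    """Return the label with the highest log-probability score."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior of the label
        for name in features:
            if name not in vocabulary:
                continue  # skip features never seen during training
            # Add-one smoothed estimate of P(feature present | label)
            p = (feature_counts[label][name] + 1) / (count + 2)
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data
toy_train = [
    ({"great": True, "fun": True}, "pos"),
    ({"great": True, "moving": True}, "pos"),
    ({"boring": True, "slow": True}, "neg"),
    ({"boring": True, "bad": True}, "neg"),
]
model = train_naive_bayes(toy_train)
print(classify(model, {"great": True, "film": True}))  # pos
print(classify(model, {"boring": True}))               # neg
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing keeps a feature unseen for one label from forcing that label's probability to zero.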
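Finally, use the model to classify new text. The input should be tokenized the same way as the corpus (lowercase, with punctuation split off), otherwise tokens such as `This` or `great!` will not match any feature seen during training:
```python
text = "This movie is great!"
# Lowercase and split punctuation so tokens match the corpus vocabulary
features = extract_features(nltk.tokenize.wordpunct_tokenize(text.lower()))
print(classifier.classify(features))
```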
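How the input is tokenized affects which features the classifier can recognize. The dependency-free sketch below compares a plain whitespace split with a small regex tokenizer; `simple_tokenize` is a hypothetical helper written for this comparison, not part of nltk.

```python
import re


def simple_tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


text = "This movie is great!"
print(text.split())           # ['This', 'movie', 'is', 'great!']
print(simple_tokenize(text))  # ['this', 'movie', 'is', 'great', '!']
```

With the whitespace split, `great!` is a single token that a model trained on the separate tokens `great` and `!` would treat as unseen.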
The complete code is as follows:
```python
import nltk
import random
from nltk.corpus import movie_reviews

nltk.download("movie_reviews")  # one-time corpus download

def extract_features(words):
    # Map each distinct token to True (presence-only bag of words)
    return {word: True for word in words}

# Pair each review's tokens with its label ("pos" or "neg")
documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)

featuresets = [(extract_features(document), category)
               for (document, category) in documents]
# Hold out the first 100 shuffled reviews for testing
train_set, test_set = featuresets[100:], featuresets[:100]

classifier = nltk.NaiveBayesClassifier.train(train_set)
print("Accuracy:", nltk.classify.accuracy(classifier, test_set))

# Lowercase and split punctuation so tokens match the corpus vocabulary
text = "This movie is great!"
features = extract_features(nltk.tokenize.wordpunct_tokenize(text.lower()))
print(classifier.classify(features))
```
Sample output (the accuracy varies from run to run because the documents are shuffled randomly):
```
Accuracy: 0.81
pos
```