
Python code for a movie review word cloud

Date: 2023-06-17 07:06:18

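Below is a simple Python script that generates a word cloud from movie reviews:

```python
import jieba
import matplotlib.pyplot as plt
import numpy as np
from collections import Counter
from PIL import Image
from wordcloud import WordCloud

# Read the review text
with open('movie_reviews.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# Segment the text into words with jieba
words = jieba.cut(text)

# Load the stop words, one per line
with open('stopwords.txt', 'r', encoding='utf-8') as f:
    stopwords = {line.strip() for line in f}

# Drop stop words and whitespace-only tokens (jieba also yields spaces and newlines)
filtered_words = [w for w in words if w.strip() and w not in stopwords]

# Count word frequencies
word_counts = Counter(filtered_words)

# Generate the word cloud in the shape given by the mask image
mask = np.array(Image.open('mask.png'))
wc = WordCloud(background_color='white', mask=mask, font_path='msyh.ttc',
               max_words=500, max_font_size=100,
               width=800, height=800)  # width/height are ignored when a mask is given
wc.generate_from_frequencies(word_counts)

# Display the result
plt.imshow(wc, interpolation='bilinear')
plt.axis('off')
plt.show()
```

Here, `movie_reviews.txt` contains the review text, `stopwords.txt` lists the stop words (one per line), `mask.png` defines the shape of the word cloud, and `msyh.ttc` is the font file (Microsoft YaHei; a font with Chinese glyphs is needed for the words to render). Note that this code uses Chinese word segmentation and a Chinese stop word list. To process text in another language, replace the tokenizer and the stop words accordingly.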
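As a rough illustration of the closing note about other languages, the segmentation and counting steps for English reviews might look like the sketch below. It assumes whitespace-delimited text, so a regular expression stands in for jieba. The function name `count_english_words` and the tiny `ENGLISH_STOPWORDS` set are hypothetical and only for demonstration; a real stop word list would normally be loaded from a file, as in the script above.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word set used only for this demo.
ENGLISH_STOPWORDS = {"the", "a", "an", "and", "of", "is", "was", "it", "this", "but"}


def count_english_words(text, stopwords=ENGLISH_STOPWORDS):
    """Return a word -> count mapping for whitespace-delimited text."""
    # Lowercase the text, then keep runs of letters
    # (apostrophes are allowed inside a word, e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Count every token that is not a stop word.
    return Counter(t for t in tokens if t not in stopwords)


if __name__ == "__main__":
    sample = "The plot was thin, but the acting was great. Great cast, great score!"
    counts = count_english_words(sample)
    print(counts.most_common(3))
```

The resulting mapping can be passed to `generate_from_frequencies` in place of `word_counts`. For English text a Latin-script font is sufficient, so `font_path` no longer has to point to a Chinese font.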