Please write a Python program for the following task: 2. Choose your own corpus, build a word cloud, and give a brief analysis

Time: 2024-02-18 09:05:43 Views: 75

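Sure. Here is a simple Python program that builds a word cloud and performs a brief analysis:

```python
# Import the required libraries
import jieba
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Read the text file and segment it into words
with open('my_text.txt', 'r', encoding='utf-8') as f:
    text = f.read()
words = jieba.cut(text)

# Remove stop words, plus whitespace-only tokens such as line breaks
stop_words = ['的', '了', '是', '我', '你', '他', '她', '它', '我们', '你们', '他们', '她们', '它们']
words = [word for word in words if word.strip() and word not in stop_words]

# Join the segmented words into a single space-separated string
words_str = ' '.join(words)

# Build the word cloud; font_path must point to a font with Chinese glyphs
# (msyh.ttc is Microsoft YaHei, which ships with Windows)
wc = WordCloud(font_path='msyh.ttc', width=800, height=600, background_color='white').generate(words_str)

# Display the word cloud
plt.imshow(wc)
plt.axis('off')
plt.show()

# Count word frequencies
word_count = {}
for word in words:
    if word in word_count:
        word_count[word] += 1
    else:
        word_count[word] = 1

# Print the 10 most frequent words (slicing avoids an IndexError on short texts)
sorted_word_count = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
for item in sorted_word_count[:10]:
    print(item)
```

In the code above, we first read a text file and segment the text with the jieba library. We then remove some stop words, which carry little meaning on their own, such as "的" and "了", along with whitespace-only tokens. Next, we join the segmentation result into a single string and build a word cloud with the WordCloud library. Finally, we count how often each word occurs and print the 10 most frequent words.

You only need to replace `my_text.txt` in the code with your own text file. If you are not on Windows, point `font_path` at any font file that contains Chinese glyphs. You can also define a custom stop-word list or apply other processing to the segmentation result.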
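The closing suggestion about a custom stop-word list and further processing could look roughly like the sketch below. It is not part of the original answer: the helper names `load_stop_words` and `top_words` and the sample tokens are hypothetical, and the code assumes the text has already been segmented (for example with `list(jieba.cut(text))`). It uses `collections.Counter` instead of the hand-written dictionary loop and also reports each word's share of the kept tokens, which may help with the "brief analysis" part of the task.

```python
from collections import Counter


def load_stop_words(lines):
    """Build a stop-word set from an iterable of lines (one word per line)."""
    return {line.strip() for line in lines if line.strip()}


def top_words(tokens, stop_words, n=10):
    """Return (word, count, share) for the n most frequent kept tokens."""
    # Drop stop words and whitespace-only tokens such as line breaks
    kept = [t for t in tokens if t.strip() and t not in stop_words]
    total = len(kept)
    counts = Counter(kept)
    # most_common() is empty when nothing is kept, so total is never zero here
    return [(word, count, count / total) for word, count in counts.most_common(n)]


# Hypothetical, already-segmented tokens standing in for jieba output
tokens = ['词云', '的', '分析', '词云', '\n', '语料', '了', '词云', '分析']
stop_words = load_stop_words(['的', '了', ''])

for word, count, share in top_words(tokens, stop_words, n=3):
    print(f'{word}\t{count}\t{share:.1%}')
```

To read the stop words from disk, an open file object can be passed straight to `load_stop_words`, for example `load_stop_words(open('stop_words.txt', encoding='utf-8'))`, where `stop_words.txt` is an assumed file with one word per line.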
