Counting word frequencies in a txt file with Python and visualizing the results

Date: 2024-02-12 11:06:17 Views: 86

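Sure, here is one way to implement this. First, check that the required Python libraries, `matplotlib` and `nltk`, are installed. If they are not, you can install them with the following command (the leading `!` is notebook syntax; drop it when running in a terminal):

```python
!pip install matplotlib nltk
```

Then follow these steps:

1. Import the required libraries

```python
import matplotlib.pyplot as plt
import nltk
from nltk.tokenize import word_tokenize
from nltk.probability import FreqDist

# word_tokenize needs the Punkt tokenizer data, downloaded once.
# Newer NLTK releases look for 'punkt_tab' instead, so request both.
nltk.download('punkt')
nltk.download('punkt_tab')
```

2. Read the text file and tokenize it

```python
# State the encoding explicitly; change it if your file is not UTF-8.
with open('your_file.txt', 'r', encoding='utf-8') as file:
    text = file.read()

# Lowercase the text and keep alphabetic tokens only,
# so punctuation marks are not counted as words.
tokens = [t for t in word_tokenize(text.lower()) if t.isalpha()]
```

3. Count word frequencies and take the top N words

```python
fdist = FreqDist(tokens)
n = 10  # number of most frequent words to keep
top_words = dict(fdist.most_common(n))
```

4. Draw the chart

```python
plt.bar(list(top_words.keys()), list(top_words.values()))
plt.title('Top {} Words in Text'.format(n))
plt.xlabel('Words')
plt.ylabel('Frequency')
plt.show()
```

This produces a bar chart of the N most frequent words in the text and how often each one occurs. Replace `your_file.txt` in the code with the path to the text file you want to process, and set `n` to the number of top words you want to show.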
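If installing `nltk` is not an option, the counting step can also be sketched with the standard library alone. The version below is a minimal sketch, not the approach the answer above uses: it swaps `word_tokenize` and `FreqDist` for a regular expression and `collections.Counter`. The function name `count_top_words` and the sample sentence are hypothetical, chosen only for illustration, and the `\w+` pattern is a rough tokenizer that splits contractions such as "don't" into two tokens.

```python
import re
from collections import Counter


def count_top_words(text, n=10):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Lowercase so "The" and "the" are counted together, then take runs of
    # word characters (letters, digits, underscore); punctuation is dropped.
    words = re.findall(r"\w+", text.lower())
    # most_common sorts by count, highest first.
    return Counter(words).most_common(n)


# Hypothetical sample text; in practice, pass the contents of your file.
sample = "the cat and the dog and the bird"
print(count_top_words(sample, 2))
```

The returned list of pairs can be converted with `dict(...)` and passed to `plt.bar` in the same way as `top_words` in step 4.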
