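I need a Python program that counts word frequencies in the text of my txt file and prints each word with its number of occurrences. My txt file is in Chinese, so it may need some preprocessing. Please give me the complete code.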

Time: 2024-10-29 16:24:50 Views: 23

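Sure. Below is a simple Python program that uses the `jieba` library for Chinese word segmentation and `Counter` from the standard-library `collections` module for word-frequency counting:

```python
# Install jieba first if you have not already: pip install jieba
# (collections ships with Python and does not need to be installed)
import os

import jieba
from collections import Counter


def word_frequency_in_text(file_path, stopwords_path='stopwords.txt'):
    # Load the Chinese stopword list if one exists (optional, improves precision)
    stopwords = set()
    if os.path.exists(stopwords_path):
        with open(stopwords_path, 'r', encoding='utf-8') as f:
            stopwords = set(f.read().splitlines())

    # Read the txt file
    with open(file_path, 'r', encoding='utf-8') as f:
        text = f.read()

    # Segment with jieba, dropping stopwords and whitespace-only tokens
    words = [word for word in jieba.lcut(text)
             if word.strip() and word not in stopwords]

    # Count word frequencies with Counter
    word_counts = Counter(words)

    # Print every word with its count, most frequent first
    for word, count in word_counts.most_common():
        print(f'{word}: {count}')


# Replace 'data.txt' below with your actual file path
file_path = 'data.txt'
word_frequency_in_text(file_path)
```

The stopword list file 'stopwords.txt' is optional: if it is not present, the program simply skips stopword filtering. Note that punctuation marks are counted as tokens unless they appear in the stopword list. Save the code above to a .py file and run it; it prints every word in the text together with its number of occurrences, ordered from most to least frequent.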
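Since the question mentions that a Chinese txt file may need some preprocessing, here is a minimal sketch of two optional steps. It assumes the file might be saved in a GBK-family encoding rather than UTF-8, and that you only want tokens written entirely in Chinese characters. The helper names `read_text_with_fallback` and `count_chinese_tokens`, the encoding list, and the `min_length` parameter are illustrative choices for this sketch, not part of `jieba` or of the program above.

```python
import re
from collections import Counter

# Matches tokens made up entirely of characters in the basic CJK block (U+4E00..U+9FFF)
CHINESE_TOKEN = re.compile(r'^[\u4e00-\u9fff]+$')


def read_text_with_fallback(file_path, encodings=('utf-8-sig', 'gb18030')):
    """Try each encoding in turn and return the first successful decode."""
    for encoding in encodings:
        try:
            with open(file_path, 'r', encoding=encoding) as f:
                return f.read()
        except UnicodeDecodeError:
            continue
    raise ValueError(f'Could not decode {file_path} with any of {encodings}')


def count_chinese_tokens(tokens, min_length=2):
    """Count only tokens that are pure Chinese and at least min_length characters long."""
    kept = [t for t in tokens if len(t) >= min_length and CHINESE_TOKEN.match(t)]
    return Counter(kept)


if __name__ == '__main__':
    # Hand-written tokens stand in for the output of jieba.lcut(text)
    sample_tokens = ['我', '喜欢', '编程', '，', '喜欢', 'Python', '\n', '编程']
    for word, count in count_chinese_tokens(sample_tokens).most_common():
        print(f'{word}: {count}')
```

To combine this with the program above, you could read the file with `read_text_with_fallback(file_path)` and pass the result of `jieba.lcut(text)` to `count_chinese_tokens`. Setting `min_length=1` keeps single-character words, which the default drops along with punctuation, Latin text, and digits.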
