Split a txt document into three parts and count word frequencies for each

Date: 2023-08-19 16:04:57 Views: 33

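Sure, here is a Python example that splits a txt document into three parts and counts word frequencies in each. It assumes the file marks the start of each part with a heading such as `PART 1`, `PART 2`, `PART 3`.

```python
# Import the required libraries
import re
from collections import Counter

# Read the text file
with open('example.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# Split the text at markers such as "PART 1". Drop empty pieces, such as the
# empty string re.split returns when the file starts with a marker.
parts = [p for p in re.split(r'PART\s\d+', text) if p.strip()]

# Define a function that counts word frequencies
def count_words(text):
    # Replace punctuation and digits with spaces
    text = re.sub(r'[^\w\s]|\d', ' ', text)
    # Convert the text to lowercase
    text = text.lower()
    # Split into a list of words on any whitespace, including newlines
    words = text.split()
    # Count how often each word occurs and return the result
    return Counter(words)

# Count word frequencies for each part
for i, part in enumerate(parts):
    word_counts = count_words(part)
    print(f'Part {i+1} word counts:', word_counts.most_common(10))
```

In this example, we first use `re.split` from Python's `re` module to cut the text at the `PART n` markers, which yields three parts when the file contains three markers and no text before the first one. Any text that precedes the first marker would be counted as its own part. We then define a function named `count_words` that takes text as input and returns the number of times each word occurs in it. Finally, we loop over the parts and call `count_words` on each one, printing the 10 most common words and their counts for each part.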
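If the file contains no `PART n` markers, one alternative is to cut it into three consecutive chunks with roughly equal line counts. The following is a minimal sketch under that assumption. The helper name `split_into_three` and the sample text are hypothetical and used only for illustration. Splitting on lines is one possible choice, and characters or paragraphs would work the same way.

```python
import re
from collections import Counter


def split_into_three(text):
    """Split text into three consecutive chunks with near-equal line counts."""
    lines = text.splitlines()
    n = len(lines)
    # Boundaries at one third and two thirds of the line count.
    a, b = n // 3, 2 * n // 3
    return ['\n'.join(lines[:a]), '\n'.join(lines[a:b]), '\n'.join(lines[b:])]


def count_words(text):
    """Count lowercase words, treating punctuation and digits as separators."""
    text = re.sub(r'[^\w\s]|\d', ' ', text).lower()
    return Counter(text.split())


# Hypothetical sample text; replace it with the contents of your own file.
sample = "a b\nb c\nc d\nd e\ne f\nf g"
for i, part in enumerate(split_into_three(sample), start=1):
    print(f'Part {i} word counts:', count_words(part).most_common(3))
```

When the file has fewer than three lines, the earlier chunks come back as empty strings, so their word counts are empty.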