Word Frequency Counting for 宋词三百首 (Three Hundred Song Ci Poems) in Python
Date: 2023-11-29 12:45:21
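The following example counts word frequencies in 宋词三百首 with Python. It assumes the text is saved as `song.txt` in UTF-8 and that the `jieba` package is installed: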
```python
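import jieba
from collections import Counter

# Read the corpus
with open('song.txt', 'r', encoding='utf-8') as f:
    text = f.read()

# Segment the text into words
words = jieba.lcut(text)

# Punctuation to discard (ASCII forms plus their fullwidth equivalents)
stopwords = {',', '。', '?', '!', '、', ':', ';', '「', '」', '『', '』', '(', ')', '—', '…', '·', '《', '》',
             ',', '?', '!', ':', ';', '(', ')'}
# Drop punctuation and whitespace-only tokens such as '\n' and ' '
words_clean = [word for word in words if word.strip() and word not in stopwords]

# Count word frequencies
word_counts = Counter(words_clean)

# Get the 10 most frequent words
top_10 = word_counts.most_common(10)
print(top_10)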
```
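As a dependency-free variant, the same counting step can be run on character n-grams instead of segmented words. This is a different technique from the `jieba` segmentation above, not a replacement for it: it simply counts every run of `n` adjacent Chinese characters, which may be worth comparing since segmenters are typically tuned for modern rather than classical Chinese. The function name `ngram_counts`, the CJK range used in the regular expression, and the sample line are assumptions made for illustration.

```python
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count character n-grams inside runs of CJK ideographs."""
    counts = Counter()
    # Split on anything outside the basic CJK block so that
    # n-grams never span punctuation, spaces, or line breaks.
    for run in re.findall(r'[\u4e00-\u9fff]+', text):
        for i in range(len(run) - n + 1):
            counts[run[i:i + n]] += 1
    return counts


# Illustrative input only; in practice pass the contents of song.txt.
sample = "明月几时有,把酒问青天。不知天上宫阙,今夕是何年。"
print(ngram_counts(sample, 2).most_common(5))
```

Because the text is split at punctuation before counting, no separate stopword list is needed here; the trade-off is that many of the resulting n-grams will not be real words.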