Use Python to find the ten most frequent words in Romance of the Three Kingdoms (Volume 2), with their occurrence counts
Posted: 2023-07-01 22:06:39
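To do this, use the jieba library for Chinese word segmentation, then use the Counter class from Python's standard collections module to count word frequencies. The steps are as follows:
1. Install the jieba library (the leading `!` runs the command from a Jupyter notebook cell; omit it in a regular shell):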
```
!pip install jieba
```
2. Import the required libraries:
```python
import jieba
from collections import Counter
```
3. Load the text and segment it into words:
```python
with open('三国演义(下卷).txt', encoding='utf-8') as f:
    text = f.read()
words = jieba.lcut(text)
```
4. Count word frequencies with the Counter class:
```python
word_counts = Counter(words)
```
5. Get the ten most frequent words:
```python
top_words = word_counts.most_common(10)
```
6. Print the results:
```python
for word, count in top_words:
    print(f'{word}: {count}')
```
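To see what steps 4 and 5 produce without loading the whole novel, here is a minimal sketch on a small hand-written token list. The list `tokens` is an assumption standing in for the output of `jieba.lcut(text)`; it is not real segmentation output.

```python
from collections import Counter

# Hand-written tokens standing in for jieba.lcut(text) (illustrative only).
tokens = ["曹操", "孔明", "曹操", "玄德", "孔明", "曹操"]

# Counter maps each distinct token to the number of times it occurs.
word_counts = Counter(tokens)

# most_common(n) returns (word, count) pairs sorted by count, highest first.
top_two = word_counts.most_common(2)
print(top_two)  # [('曹操', 3), ('孔明', 2)]
```

Each entry is a `(word, count)` tuple, which is why the loop in step 6 unpacks it as `for word, count in top_words`.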
The complete code is as follows:
```python
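import jieba
from collections import Counter

# Read the whole novel into memory
with open('三国演义(下卷).txt', encoding='utf-8') as f:
    text = f.read()

# Segment the text into a list of words
words = jieba.lcut(text)
# Count how often each word occurs
word_counts = Counter(words)
# Take the ten most frequent words with their counts
top_words = word_counts.most_common(10)
for word, count in top_words:
    print(f'{word}: {count}')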
```
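One caveat: `jieba.lcut` also returns punctuation marks, line breaks, and single-character tokens, so the raw top ten is likely to be dominated by entries such as `,` and `。` rather than meaningful words. A possible refinement, sketched below, filters the tokens before counting. The helpers `is_word` and `top_words` and the `min_len` parameter are hypothetical names introduced for this illustration, and the sample token list is hand-written rather than real jieba output.

```python
import unicodedata
from collections import Counter


def is_word(token, min_len=2):
    """Return True if the token should be counted (hypothetical helper)."""
    token = token.strip()  # whitespace-only tokens become empty
    if len(token) < min_len:
        return False
    # Reject tokens containing any punctuation (Unicode categories P*).
    return not any(unicodedata.category(ch).startswith("P") for ch in token)


def top_words(tokens, n=10, min_len=2):
    """Count only the tokens accepted by is_word and return the n most common."""
    counts = Counter(t for t in tokens if is_word(t, min_len))
    return counts.most_common(n)


# Hand-written tokens standing in for jieba.lcut(text) (illustrative only).
tokens = ["曹操", ",", "孔明", "曰", ":", "“", "曹操", "。", "\n",
          "……", "孔明", "曹操"]
print(top_words(tokens, n=2))  # [('曹操', 3), ('孔明', 2)]
```

With the real text this would be called as `top_words(jieba.lcut(text))`. Note that the default `min_len=2` also discards legitimate single-character words such as `曰`; pass `min_len=1` to keep them while still dropping punctuation and whitespace.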