Python word frequency counting
Date: 2023-07-01 21:14:29
You can use the collections module from Python's standard library to count word frequencies. Here is a simple example:
```python
from collections import Counter
text = "Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales."
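# Lowercase the text and split it on whitespace into words
# (punctuation stays attached, so "programming." is counted separately from "programming")
words = text.lower().split()
# Count how often each word occurs
word_freq = Counter(words)
# Print the 10 most frequent words
print(word_freq.most_common(10))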
```
Output:
```
[('python', 2), ('programming', 2), ('and', 2), ('that', 2), ('is', 1), ('an', 1), ('interpreted', 1), ('high-level', 1), ('language', 1), ('for', 1)]
```
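Here the Counter class counts how many times each word occurs. The most_common method returns a list of (element, count) pairs for the n most frequent elements; called without an argument, it returns all elements, ordered from most to least frequent. Elements with equal counts appear in the order in which they were first encountered.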
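Splitting on whitespace leaves punctuation attached to words, so a token such as "programming." is counted separately from "programming". The following is a minimal sketch of one way to handle this, assuming a regular-expression tokenizer is acceptable for your text. The function name count_words, the sample sentence, and the pattern are illustrative choices, not part of the original example:

```python
import re
from collections import Counter

def count_words(text, n=None):
    """Return the n most common words in text (all words if n is None)."""
    # Assumed tokenizer: runs of ASCII letters/digits, optionally joined
    # by internal hyphens or apostrophes (e.g. "general-purpose").
    tokens = re.findall(r"[a-z0-9]+(?:[-'][a-z0-9]+)*", text.lower())
    return Counter(tokens).most_common(n)

sample = "Programming is fun. Python programming, in particular, is general-purpose programming!"

# With an argument: only the top n (word, count) pairs
print(count_words(sample, 2))    # [('programming', 3), ('is', 2)]

# Without an argument: every distinct word, most frequent first
print(len(count_words(sample)))  # 7
```

The pattern only covers ASCII letters and digits; text in other languages would need a different tokenizer.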