Word frequency counting in Python
Date: 2023-11-04 08:57:16 Views: 120
Word Frequency Counting in Python
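Word frequency counting means counting how often each word occurs in a piece of text. Python offers several ways to do it. One is to deduplicate the words with a set, count each one with the list `count` method, and sort the results by number of occurrences. Another is to tally the occurrences in a dictionary and then sort by count. A third is to use a counter (`collections.Counter`).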
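All three functions below take `words`, a list of already separated words. As a minimal sketch of how such a list might be produced from raw English text, the helper `tokenize` here is hypothetical (it is not part of the original article) and assumes that lowercasing and splitting on non-letter characters is good enough; Chinese text would need a word segmenter instead.

```python
import re

def tokenize(text):
    """Lowercase the text and return its words as a list (hypothetical helper)."""
    # [a-z']+ keeps letters and apostrophes, so "didn't" stays one token;
    # digits, spaces, and punctuation act as separators.
    return re.findall(r"[a-z']+", text.lower())

words = tokenize("The cat saw the dog. The dog didn't see the cat!")
print(words)
```

The resulting `words` list can be passed to any of the `word_count` functions.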
Method 1:
```python
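def word_count1(words, n):
    # Count each distinct word with list.count; simple, but it rescans the list once per distinct word.
    word_list = []
    for word in set(words):
        num = words.count(word)
        word_list.append([word, num])
    # Sort by count, most frequent first.
    word_list.sort(key=lambda x: x[1], reverse=True)
    # Slicing avoids an IndexError when there are fewer than n distinct words.
    for word, count in word_list[:n]:
        print("{0:<15}{1:>5}".format(word, count))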
```
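The format string `"{0:<15}{1:>5}"` used for output left-aligns the word in a 15-character column and right-aligns the count in a 5-character column. A small sketch of what that produces, using made-up sample values:

```python
# "<15" pads the word on the right to 15 characters,
# ">5" pads the count on the left to 5 characters.
row = "{0:<15}{1:>5}".format("python", 42)
print(repr(row))
print(len(row))

# Words longer than 15 characters are not truncated; the row just gets wider.
long_row = "{0:<15}{1:>5}".format("a" * 18, 3)
print(len(long_row))
```

If the input may contain long words, the column no longer lines up, so a wider field may be worth choosing.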
Method 2:
```python
def word_count2(words, n):
    counts = {}
    for word in words:
        # Skip single-character words.
        if len(word) == 1:
            continue
        counts[word] = counts.get(word, 0) + 1
    items = list(counts.items())
    # Sort by count, most frequent first.
    items.sort(key=lambda x: x[1], reverse=True)
    # Slicing avoids an IndexError when there are fewer than n distinct words.
    for word, count in items[:n]:
        print("{0:<15}{1:>5}".format(word, count))
```
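Unlike Method 1, Method 2 skips single-character words, so the two can report different results for the same input. The sketch below isolates the dictionary tally and makes the length filter explicit; the function name `count_words` and the `min_len` parameter are assumptions added for illustration, not part of the original.

```python
def count_words(words, min_len=2):
    """Return {word: count}, ignoring words shorter than min_len (hypothetical helper)."""
    counts = {}
    for word in words:
        if len(word) < min_len:
            continue
        # dict.get returns 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    return counts

sample = ["a", "to", "be", "or", "not", "to", "be", "i"]
print(count_words(sample))             # single-character words dropped
print(count_words(sample, min_len=1))  # every word kept
```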
Method 3:
```python
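from collections import Counter

def word_count3(words, n):
    counts = Counter(words)
    # Characters to exclude from the result; the string is empty here, so this loop does nothing.
    for ch in "":
        del counts[ch]
    # most_common(n) returns the n most frequent (word, count) pairs.
    for word, count in counts.most_common(n):
        print("{0:<15}{1:>5}".format(word, count))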
```
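As a quick sanity check, the three counting strategies should agree when no words are filtered out. The following sketch, using a made-up sample list, compares the raw tallies rather than the printed output:

```python
from collections import Counter

words = ["to", "be", "or", "not", "to", "be", "to"]

# Strategy 1: set for deduplication plus list.count.
by_count_method = {w: words.count(w) for w in set(words)}

# Strategy 2: dictionary tally.
by_dict = {}
for w in words:
    by_dict[w] = by_dict.get(w, 0) + 1

# Strategy 3: collections.Counter.
by_counter = Counter(words)

assert by_count_method == by_dict == dict(by_counter)
print(by_counter.most_common(2))  # [('to', 3), ('be', 2)]
```

`Counter` does the tallying and the top-n selection in one place, which is why it tends to be the shortest of the three.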