
Counting Word Frequency with a Dictionary in Python

Date: 2024-02-09 12:16:38 Views: 91

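You can use a Python dictionary (`dict`) to count how many times each word occurs in a text:

```python
text = "This is a sample text with several words. We will use Python to count the frequency of each word."
words = text.lower().split()  # lowercase the text, then split it on whitespace
freq = {}  # empty dictionary that maps each word to its count
for word in words:
    if word in freq:
        freq[word] += 1  # word already seen: increment its count
    else:
        freq[word] = 1  # first occurrence: add it with a count of 1
print(freq)
```

Output:

```
{'this': 1, 'is': 1, 'a': 1, 'sample': 1, 'text': 1, 'with': 1, 'several': 1, 'words.': 1, 'we': 1, 'will': 1, 'use': 1, 'python': 1, 'to': 1, 'count': 1, 'the': 1, 'frequency': 1, 'of': 1, 'each': 1, 'word.': 1}
```

Each word is stored as a key with its count as the value. Note that `split()` only splits on whitespace, so punctuation stays attached to the neighboring word: `'words.'` and `'word.'` keep their trailing periods and are counted as separate keys.

To order the result by frequency, use the built-in `sorted()` function with a lambda as the sort key. For example, to sort from the highest count to the lowest:

```python
sorted_freq = sorted(freq.items(), key=lambda x: x[1], reverse=True)
print(sorted_freq)
```

Output:

```
[('this', 1), ('is', 1), ('a', 1), ('sample', 1), ('text', 1), ('with', 1), ('several', 1), ('words.', 1), ('we', 1), ('will', 1), ('use', 1), ('python', 1), ('to', 1), ('count', 1), ('the', 1), ('frequency', 1), ('of', 1), ('each', 1), ('word.', 1)]
```

Every word in this sample occurs exactly once, so the sort leaves the order unchanged: `sorted()` is stable, and entries with equal counts keep the dictionary's insertion order (guaranteed in Python 3.7 and later). On a text with repeated words, the most frequent words come first.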
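Because `split()` leaves punctuation attached (tokens such as `'words.'`), and because the sample sentence has no repeated words, the effect of the frequency sort is hard to see. The following is a minimal sketch of one possible variation, not the original answer's method: it tokenizes with a regular expression and counts with `collections.Counter` from the standard library. The sample text and the rule that a word is a run of letters or apostrophes are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample text with repeated words, so the counts differ.
text = ("This is a sample text with several words. "
        "The sample text repeats some words, and the words are counted.")

# Assumption: a "word" is a run of lowercase letters or apostrophes.
# This drops periods and commas instead of leaving them attached.
words = re.findall(r"[a-z']+", text.lower())

# Counter is a dict subclass that does the counting loop for us.
freq = Counter(words)

# most_common(n) returns the n highest counts, largest first.
top_three = freq.most_common(3)
print(top_three)  # the first entry is ('words', 3)
```

`Counter` supports the usual dictionary operations, so `freq["sample"]` still returns a count, and looking up a word that never appeared returns 0 instead of raising `KeyError`.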
