Python code for counting word frequencies
Date: 2023-07-10 12:32:00
The following is an example of Python code that counts word frequencies:
```python
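# Define the text whose word frequencies will be counted
text = "This is a sample text with several words. This text is for demonstration purposes."
# Convert the text to lowercase so that counting is case-insensitive
text = text.lower()
# Split the text on whitespace into a list of words
words_list = text.split()
# Create an empty dictionary to store each word and its count
word_count = {}
# Walk through the word list and count each word's occurrences
for word in words_list:
    if word in word_count:
        word_count[word] += 1
    else:
        word_count[word] = 1
# Print each word and its count
for word, count in word_count.items():
    print(f"{word}: {count}")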
```
The output is as follows:
```
this: 2
is: 2
a: 1
sample: 1
text: 2
with: 1
several: 1
words.: 1
for: 1
demonstration: 1
purposes.: 1
```
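Note that `words.` and `purposes.` are counted with their trailing periods, because `split()` only splits on whitespace. One way to avoid this, sketched here under the assumption that a word is a run of lowercase letters, digits, or apostrophes, is to extract tokens with a regular expression. The helper name `tokenize` is hypothetical and is not part of the original example.

```python
import re


def tokenize(text):
    # Lowercase the text, then pull out runs of letters, digits, or apostrophes.
    # Punctuation such as "." or "," never becomes part of a token.
    return re.findall(r"[a-z0-9']+", text.lower())


text = "This is a sample text with several words. This text is for demonstration purposes."

word_count = {}
for word in tokenize(text):
    # dict.get returns 0 for a word that has not been seen yet
    word_count[word] = word_count.get(word, 0) + 1

for word, count in word_count.items():
    print(f"{word}: {count}")
```

With this change the keys are `words` and `purposes`, without the periods. The pattern only covers ASCII letters, so it would need adjusting for other alphabets.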
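Related questions
Counting word frequencies in Python
In Python, word frequencies can be counted with a built-in data structure such as a dictionary (`dict`) combined with loops and conditionals, with the `Counter` class from `collections`, or with a dedicated text-processing library such as NLTK (Natural Language Toolkit). Here is a simple example that uses `collections.Counter` to count word frequencies in a string or a file: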
```python
from collections import Counter


def count_words(text):
    # Remove punctuation and convert all letters to lowercase
    text = ''.join(e for e in text if e.isalnum() or e.isspace()).lower()
    # Split the text into words on whitespace
    words = text.split()
    # Count word frequencies with Counter
    word_counts = Counter(words)
    return word_counts


# Example: count word frequencies in a string
text = "This is a test text, and this text contains some repeated words."
word_freqs = count_words(text)
print(word_freqs)

# Or count word frequencies in a file (replace 'yourfile.txt' with a real path)
with open('yourfile.txt', 'r', encoding='utf-8') as file:
    text = file.read()
word_freqs = count_words(text)
print(word_freqs)
```
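After running this code, you get a `Counter` object (a `dict` subclass) whose keys are the words and whose values are the number of times each word appears in the text.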
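Because `Counter` is a `dict` subclass, the result also provides `most_common(n)`, which returns the `n` most frequent items as `(word, count)` tuples. A short sketch, using a made-up sample sentence chosen only for illustration:

```python
from collections import Counter

# Sample sentence chosen for illustration only
words = "the cat and the dog and the bird".split()
word_counts = Counter(words)

# The two most frequent words, highest count first
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('and', 2)]

# A word that never appears has a count of 0 instead of raising KeyError
print(word_counts["fish"])  # 0
```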
Word-frequency counting code in Python
The following is a simple piece of Python code for counting word frequencies:
```python
text = "This is a sample text with several words. This is another sample text with some different words."
# Convert all uppercase letters to lowercase so that counting is case-insensitive
text = text.lower()
# Split the text into words
words = text.split()
# Create an empty dictionary to hold each word's count
word_count = {}
# Walk through the words and count their occurrences
for word in words:
    if word in word_count:
        word_count[word] += 1
    else:
        word_count[word] = 1
# Sort the dictionary items by count, from highest to lowest
sorted_word_count = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
# Print each word and its count
for word, count in sorted_word_count:
    print(word, count)
```
Output:
```
this 2
is 2
sample 2
text 2
with 2
words. 2
a 1
several 1
another 1
some 1
different 1
```
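Because Python's sort is stable, words with equal counts stay in the order in which they were first seen. If an alphabetical order among ties is preferred, one option is to sort on a `(-count, word)` key instead of using `reverse=True`. A minimal sketch, with a small hand-written dictionary standing in for `word_count`:

```python
# Hypothetical counts, standing in for the word_count dictionary
word_count = {"this": 2, "is": 2, "a": 1, "sample": 2, "text": 2}

# Sort by count descending, then by word ascending to break ties
sorted_word_count = sorted(word_count.items(), key=lambda x: (-x[1], x[0]))

for word, count in sorted_word_count:
    print(word, count)
```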