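Write a program that uses string and composite data type methods to count word frequencies in a text: output the 20 most frequent words in the text of Hamlet (the original prompt also mentions converting to Chinese), sorted in descending order of occurrence and showing each word with its count.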
Posted: 2024-06-11 08:11:08
Because the full text of Hamlet was not provided, the following code example uses a short passage as sample input:
```python
text = "To be or not to be, that is the question: Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles"
# Lowercase the text, strip punctuation, then split on whitespace into a word list
# (split() with no argument also handles newlines and repeated spaces)
words = text.lower().replace(",", "").replace(".", "").replace(":", "").replace(";", "").split()
# Count how many times each word occurs
word_count = {}
for word in words:
    if word in word_count:
        word_count[word] += 1
    else:
        word_count[word] = 1
# Convert the dict to a list of (word, count) tuples, sorted by count in descending order
sorted_word_count = sorted(word_count.items(), key=lambda x: x[1], reverse=True)
# Print the top 20 words and their counts; slicing avoids an IndexError on texts with fewer than 20 distinct words
for word, count in sorted_word_count[:20]:
    print(word, count)
```
The output is:
```
to 4
the 3
be 2
or 2
of 2
not 1
that 1
is 1
question 1
whether 1
'tis 1
nobler 1
in 1
mind 1
suffer 1
slings 1
and 1
arrows 1
outrageous 1
fortune 1
```
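For the full play, the same idea can be written more compactly with the standard library. The following is a minimal sketch, not a definitive solution: the helper name `top_words` and the file name `hamlet.txt` are assumptions for illustration and do not come from the original question. It replaces every ASCII punctuation character with a space, so unlike the example above it also drops apostrophes (`'tis` becomes `tis`).

```python
import string
from collections import Counter


def top_words(text, n=20):
    """Return the n most frequent words in text as (word, count) pairs."""
    # Map every ASCII punctuation character to a space, after lowercasing.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = text.lower().translate(table).split()
    # most_common sorts by count in descending order;
    # words with equal counts keep the order in which they first appeared.
    return Counter(words).most_common(n)


sample = "To be or not to be, that is the question"
for word, count in top_words(sample, 3):
    print(word, count)

# For the whole play, assuming it is saved locally as hamlet.txt (hypothetical file name):
# with open("hamlet.txt", encoding="utf-8") as f:
#     for word, count in top_words(f.read()):
#         print(word, count)
```

`Counter.most_common` does the counting and the descending sort in one step, which replaces both the manual dictionary loop and the `sorted` call.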