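Given an English passage `text`, count how many words it contains. Then, ignoring "of", "a", and "the", report the three most frequent words and how many times each one occurs. Use Python.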
Time: 2024-11-27 10:12:15
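To do this, you can use Python's `nltk` library for the text processing (tokenization) together with `collections.Counter` for the counting. A simple approach: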
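```python
import nltk
from collections import Counter

# One-time setup: nltk.download('punkt') and nltk.download('stopwords')
# (newer NLTK releases may ask for 'punkt_tab' instead of 'punkt')


# The text variable is assumed to hold the English passage
def word_analysis(text):
    """Return (total word count, three most common non-stop words)."""
    stop_words = set(nltk.corpus.stopwords.words('english'))
    # Tokenize, keep alphabetic tokens only, and lowercase before any comparison
    words = [word.lower() for word in nltk.word_tokenize(text) if word.isalpha()]
    # Drop common stop words before counting frequencies
    content_words = [word for word in words if word not in stop_words]
    word_counts = Counter(content_words)
    # Sort and take the three most common words with their counts
    top_three_words = word_counts.most_common(3)
    # The total is taken before stop-word removal, so it covers every word
    return len(words), top_three_words


# Sample text
example_text = "This is a sample text with some words of interest. The common words like 'the', 'a', and 'is' will be removed."
length, top_three = word_analysis(example_text)
print("Total words:", length)
print("Three most frequent words:")
for word, count in top_three:
    print(f"{word}: {count} times")
```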
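Running the code above gives the total number of words in the text and the three most frequent non-stop words. Note that NLTK's English stop-word list is much broader than the three words named in the question ("of", "a", "the"), so words such as "is" or "and" are excluded from the ranking as well.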
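If only "of", "a", and "the" should be ignored, as the question states, a standard-library version may be enough. The following is a minimal sketch, not the answer's own method: the names `count_words`, `EXCLUDED`, and `sample` are illustrative, and it assumes a "word" is a run of ASCII letters, optionally followed by an apostrophe part such as in "don't".

```python
import re
from collections import Counter

# Words the question asks to ignore when ranking (assumed to be matched case-insensitively)
EXCLUDED = {"of", "a", "the"}


def count_words(text):
    """Return (total word count, top three words excluding EXCLUDED)."""
    # Lowercase first, then extract letter runs, allowing one internal apostrophe
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    # Rank only the words that are not in the excluded set
    counts = Counter(word for word in words if word not in EXCLUDED)
    # The total counts every word, including the excluded ones
    return len(words), counts.most_common(3)


sample = ("The cat saw the dog. A cat and a dog ran, and the cat "
          "of the house chased the dog; the cat won.")
total, top_three = count_words(sample)
print("Total words:", total)
for word, count in top_three:
    print(f"{word}: {count} times")
```

For this sample the total is 23 words, and the ranking is "cat" (4), "dog" (3), "and" (2). When several words tie, `most_common` keeps them in the order they were first seen.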