Python code for extracting high-frequency words
Posted: 2023-11-12 17:58:32
The following example shows how to extract high-frequency words from English text with Python and NLTK:
```python
from collections import Counter

import nltk
from nltk.corpus import stopwords

# Fetch the tokenizer models and the stop-word list on first run.
# Newer NLTK releases look for 'punkt_tab' rather than 'punkt'.
for resource in ("punkt", "punkt_tab", "stopwords"):
    nltk.download(resource, quiet=True)

text = """
This is some sample text with some repeated words. We will use this text to demonstrate our code for extracting high frequency words.
"""
# Tokenize the lowercased text
tokens = nltk.word_tokenize(text.lower())
# Remove stop words and punctuation tokens such as '.'
stop_words = set(stopwords.words('english'))
filtered_tokens = [
    word for word in tokens if word.isalpha() and word not in stop_words
]
# Count the frequency of each word
word_freq = Counter(filtered_tokens)
# Extract the top 5 most frequent words
top_words = word_freq.most_common(5)
print(top_words)
```
Output:
```
[('text', 2), ('words', 2), ('sample', 1), ('repeated', 1), ('use', 1)]
```
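Here, the `top_words` list holds the five most frequent words together with their counts. Words with equal counts appear in the order in which they were first encountered.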
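If installing NLTK is not an option, the same tokenize, filter, and count steps can be approximated with the standard library alone. The following is a minimal sketch, not a drop-in replacement: the function name `top_words`, the regex tokenizer, and the tiny `STOP_WORDS` set are all assumptions made for illustration, and a real stop-word list would be far longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word set for illustration only;
# a real list (such as NLTK's) is much longer.
STOP_WORDS = frozenset(
    {"this", "is", "some", "with", "we", "will", "to", "our", "for"}
)


def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    # Lowercase, then treat runs of ASCII letters and apostrophes as tokens;
    # punctuation and digits never match the pattern, so they are dropped.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(token for token in tokens if token not in stop_words)
    return counts.most_common(n)


sample = (
    "This is some sample text with some repeated words. "
    "We will use this text to demonstrate our code for extracting "
    "high frequency words."
)
print(top_words(sample, n=3))  # [('text', 2), ('words', 2), ('sample', 1)]
```

The regex tokenizer is cruder than `nltk.word_tokenize` (it only handles ASCII letters, for example), so it is best suited to quick checks on plain English text.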