Lemmatizing English text in Python with the nltk library
Date: 2023-07-20 11:04:35
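You can lemmatize English text with the WordNetLemmatizer class from the nltk library. The lemmatizer relies on the WordNet corpus, which has to be downloaded once (for example with `nltk.download('wordnet')`) before the first call. Here is a simple example: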
```python
from nltk.stem import WordNetLemmatizer
# Create a WordNetLemmatizer object
lemmatizer = WordNetLemmatizer()
# Define a piece of English text
text = "He was running and eating at the same time. He has bad habit of swimming after playing long hours in the Sun."
# Split the text on whitespace and lemmatize each token
# (without a pos argument, lemmatize() treats every token as a noun)
words = text.split()
lemmatized_words = [lemmatizer.lemmatize(word) for word in words]
# Print the lemmatized text
print(" ".join(lemmatized_words))
```
The output is:
```
He wa running and eating at the same time. He ha bad habit of swimming after playing long hour in the Sun.
```
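As the output shows, lemmatize() treats every word as a noun by default, so the plural "hours" is reduced to "hour", while "was" and "has" are wrongly truncated to "wa" and "ha". Verb forms such as "running" and "swimming" are left unchanged; they are reduced to "run" and "swim" only when the part of speech is passed explicitly, e.g. lemmatizer.lemmatize("running", pos="v"). Because the text is split on whitespace, tokens such as "time." and "Sun." also keep their trailing punctuation.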
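The default call treats every token as a noun, which is why the verbs in the output above are not reduced to their base forms. One common way around this is to tag each token with its part of speech first and pass that to lemmatize(). The following is a minimal sketch of that idea, not the only way to do it. The helper names `penn_to_wordnet` and `lemmatize_text` are hypothetical, introduced here for illustration; the one-letter codes "n", "v", "a", "r" are the WordNet part-of-speech codes accepted by the `pos` argument.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank POS tag to a one-letter WordNet POS code."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"  # fall back to noun, the same default lemmatize() uses


def lemmatize_text(text):
    """Tokenize, POS-tag, and lemmatize an English string (hypothetical helper)."""
    # Imported here so penn_to_wordnet stays usable without NLTK installed.
    from nltk import pos_tag, word_tokenize
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    # word_tokenize separates punctuation from words, unlike str.split()
    tagged = pos_tag(word_tokenize(text))
    return [
        lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
        for word, tag in tagged
    ]
```

Calling `lemmatize_text(text)` needs the tokenizer and tagger data in addition to WordNet. The resource names passed to `nltk.download` differ between NLTK releases, so check the error message NLTK prints if a resource is missing. The result also depends on the tagger's output, so a mis-tagged word will still be lemmatized as the wrong part of speech.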