Lemmatizing English text with the nltk library in Python

Date: 2023-07-20 11:04:35

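The `WordNetLemmatizer` class in the nltk library can be used to lemmatize English text. Here is a simple example:

```python
import nltk
from nltk.stem import WordNetLemmatizer

# The lemmatizer needs the WordNet corpus; download it once
# (some NLTK versions also require "omw-1.4")
nltk.download("wordnet", quiet=True)

# Create a WordNetLemmatizer object
lemmatizer = WordNetLemmatizer()

# Define a piece of English text
text = "He was running and eating at the same time. He has bad habit of swimming after playing long hours in the Sun."

# Split the text on whitespace and lemmatize each word
words = text.split()
lemmatized_words = [lemmatizer.lemmatize(word) for word in words]

# Print the lemmatized text
print(" ".join(lemmatized_words))
```

The output is:

```
He wa running and eating at the same time. He ha bad habit of swimming after playing long hour in the Sun.
```

As the output shows, only some words change. `lemmatize()` treats every word as a noun by default (`pos="n"`), so the plural "hours" becomes "hour", but the verb forms "running", "eating", "swimming", and "playing" are left unchanged, and "was" and "has" are wrongly stripped to "wa" and "ha" as if they were plural nouns. Because `text.split()` splits only on whitespace, punctuation also stays attached to tokens such as "time." and "Sun.". To turn "running" into "run" or "swimming" into "swim", pass the verb part of speech explicitly, for example `lemmatizer.lemmatize("running", pos="v")`.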
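`WordNetLemmatizer.lemmatize()` takes a `pos` argument that defaults to noun, so verb forms are only reduced when each word is given the right part of speech. The sketch below is one possible way to supply it automatically, not the only one: it assumes NLTK's `word_tokenize` and `pos_tag` are available with their data downloaded, and the helper names `penn_to_wordnet`, `download_resources`, and `lemmatize_text` are hypothetical, introduced here for illustration.

```python
def penn_to_wordnet(tag: str) -> str:
    """Map a Penn Treebank tag (as returned by nltk.pos_tag) to a WordNet POS code."""
    if tag.startswith("J"):
        return "a"  # adjective
    if tag.startswith("V"):
        return "v"  # verb
    if tag.startswith("R"):
        return "r"  # adverb
    return "n"      # fall back to noun, the lemmatizer's own default


def download_resources() -> None:
    """Fetch the data used below. Assumption: resource names differ between
    NLTK versions (newer releases use "punkt_tab" and
    "averaged_perceptron_tagger_eng"), so adjust the list to your install."""
    import nltk
    for name in ("wordnet", "punkt", "averaged_perceptron_tagger"):
        nltk.download(name, quiet=True)


def lemmatize_text(text: str) -> str:
    """Tokenize, POS-tag, and lemmatize each token with its tagged part of speech."""
    # Imported here so penn_to_wordnet can be used without NLTK installed
    import nltk
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    tokens = nltk.word_tokenize(text)  # separates punctuation from words
    tagged = nltk.pos_tag(tokens)      # list of (word, Penn Treebank tag) pairs
    return " ".join(
        lemmatizer.lemmatize(word, pos=penn_to_wordnet(tag))
        for word, tag in tagged
    )
```

To try it, call `download_resources()` once and then `print(lemmatize_text(text))`. The result depends on the tagger: words it labels as verbs, such as "running", should be reduced to their base form, while a mis-tagged word will still be lemmatized as whatever part of speech it was given.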
