Natural language processing: answer extraction in Python
Time: 2023-06-14 13:05:31
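In Python, answer extraction can be implemented with a natural language processing (NLP) library. The following is a simple example using NLTK:
1. Import the required libraries: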
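```python
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
# One-time resource downloads (recent NLTK releases may need 'punkt_tab' instead of 'punkt')
nltk.download('punkt')
nltk.download('stopwords')
```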
2. Prepare the text:
```python
text = "The quick brown fox jumped over the lazy dog. The dog slept all day."
```
3. Split the text into sentences and words:
```python
sentences = sent_tokenize(text)
words = word_tokenize(text)
```
4. Remove stop words:
```python
stop_words = set(stopwords.words('english'))
filtered_words = [word for word in words if word.casefold() not in stop_words]
```
5. Find the keywords in the question:
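```python
question = "What did the dog do all day?"
question_words = word_tokenize(question)
# Keep lowercased content words only: drop stop words and punctuation tokens such as "?"
keywords = [
    word.casefold() for word in question_words
    if word.isalnum() and word.casefold() not in stop_words
]
```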
6. For each sentence, find the phrase that starts at the first keyword:
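```python
for sentence in sentences:
    sentence_words = word_tokenize(sentence)
    for i, word in enumerate(sentence_words):
        # Match the first question keyword case-insensitively
        if keywords and word.casefold() == keywords[0].casefold():
            # Take a window of len(keywords) tokens starting at the match
            answer = " ".join(sentence_words[i:i + len(keywords)])
            print(answer)
```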
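In the code above, we first split the text into sentences and words. We then remove stop words and find the keywords in the question. For each sentence, we look for the first keyword and take a window of tokens starting at it, as long as the keyword list. Finally, we print each candidate answer found this way.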
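The steps above can be gathered into a single function. This is a minimal sketch rather than the NLTK pipeline itself: so that it runs without downloading any NLTK data, it assumes a regex tokenizer and a tiny hand-written stop word set. The names `STOP_WORDS`, `tokenize`, `split_sentences`, and `extract_windows` are hypothetical and introduced only for illustration.

```python
import re

# Assumption: a tiny illustrative stop word set, far smaller than NLTK's list.
STOP_WORDS = {"what", "did", "the", "do", "all", "a", "an", "over"}


def tokenize(text):
    """Split text into word tokens, dropping punctuation (simplified stand-in for word_tokenize)."""
    return re.findall(r"[A-Za-z0-9']+", text)


def split_sentences(text):
    """Naive split after '.', '!' or '?' (simplified stand-in for sent_tokenize)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_windows(text, question):
    """Return token windows that start at the first question keyword."""
    keywords = [w.casefold() for w in tokenize(question) if w.casefold() not in STOP_WORDS]
    if not keywords:
        return []
    answers = []
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        for i, word in enumerate(words):
            if word.casefold() == keywords[0]:
                # Window of len(keywords) tokens, truncated at the sentence end
                answers.append(" ".join(words[i:i + len(keywords)]))
    return answers


text = "The quick brown fox jumped over the lazy dog. The dog slept all day."
question = "What did the dog do all day?"
print(extract_windows(text, question))  # ['dog', 'dog slept']
```

Returning a list instead of printing makes the weakness of the window approach easy to see: the first sentence yields only "dog" because the keyword is its last token.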
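Note that this is only a simple answer extraction example. Real applications need more sophisticated algorithms and techniques to extract answers accurately.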
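One small step beyond the fixed-window approach is to rank whole sentences by how many question keywords they contain and return the best one. The sketch below is only one possible refinement, not a production method. It again assumes a regex tokenizer and a small hand-written stop word set in place of NLTK, and `content_words` and `best_sentence` are hypothetical helper names.

```python
import re

# Assumption: a small illustrative stop word set, not NLTK's list.
STOP_WORDS = {"what", "did", "the", "do", "all", "a", "an", "over"}


def content_words(text):
    """Return the set of lowercased word tokens with stop words removed."""
    return {w.casefold() for w in re.findall(r"[A-Za-z0-9']+", text)} - STOP_WORDS


def best_sentence(text, question):
    """Return the sentence sharing the most content words with the question, or None."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    keywords = content_words(question)
    # Score each sentence by the number of distinct keywords it contains
    scored = [(len(content_words(s) & keywords), s) for s in sentences]
    best_score, best = max(scored, key=lambda pair: pair[0], default=(0, None))
    return best if best_score > 0 else None


text = "The quick brown fox jumped over the lazy dog. The dog slept all day."
question = "What did the dog do all day?"
print(best_sentence(text, question))  # The dog slept all day.
```

Scoring by keyword overlap uses every keyword rather than only the first, so the sentence that mentions both "dog" and "day" wins over the one that mentions only "dog".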