
Implementing TextRank in Python

Date: 2023-09-30 21:07:54 Views: 45

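Below is sample Python code implementing TextRank, for reference. It builds a word co-occurrence graph, scores the words with PageRank, and then ranks each sentence by the average score of its words:

```python
import itertools

import networkx as nx
import nltk

# word_tokenize needs NLTK's tokenizer data; depending on the NLTK version,
# run nltk.download("punkt") or nltk.download("punkt_tab") once beforehand.


def textrank(sentences):
    """Return (score, sentence_index) tuples, highest score first."""
    # Split each sentence into words with NLTK
    tokenized = [nltk.tokenize.word_tokenize(sent) for sent in sentences]

    # Build a directed graph: words that co-occur in a sentence are neighbours,
    # and the edge weight counts how often the pair co-occurs
    graph = nx.DiGraph()
    for words in tokenized:
        graph.add_nodes_from(words)
        for word, other_word in itertools.permutations(words, 2):
            if word == other_word:
                continue
            if graph.has_edge(word, other_word):
                # Existing edge: increase its weight
                graph[word][other_word]["weight"] += 1
            else:
                # New edge with weight 1
                graph.add_edge(word, other_word, weight=1)

    # Compute a TextRank score for every word with PageRank
    # (nx.pagerank reads the "weight" edge attribute by default)
    word_scores = nx.pagerank(graph)

    # Score each sentence as the mean score of its words, keeping the
    # sentence index so the caller can look the sentence up again
    sentence_scores = [
        (sum(word_scores[w] for w in words) / len(words) if words else 0.0, index)
        for index, words in enumerate(tokenized)
    ]
    return sorted(sentence_scores, reverse=True)


# Test
sentences = [
    "TextRank can be used to extract text summaries and keywords.",
    "It is a graph-based model for handling relationships in text.",
    "Sentences can be treated as nodes, and the number of times words "
    "appear in the same sentence is treated as the edge weight.",
    "Finally, each word is assigned a score based on the PageRank algorithm.",
]
top_sentences = textrank(sentences)[:2]
print([sentences[index] for _, index in top_sentences])
```

Note: when using TextRank, preprocess the text first (for example, tokenization and stop word filtering) to get better results. This is only sample code and will not suit every case; modify and optimize it for your actual needs.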
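The preprocessing mentioned in the note could look roughly like the sketch below. It is only an illustration under stated assumptions: the `preprocess` helper and the tiny `STOP_WORDS` set are hypothetical names that are not part of the original sample, and the regex tokenizer only handles English-style text (other languages, such as Chinese, need a dedicated word segmenter).

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only;
# a real project would use a fuller list, e.g. NLTK's stopwords corpus.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "be", "can", "for",
    "in", "is", "it", "of", "on", "the", "to",
}


def preprocess(sentence):
    """Lowercase a sentence, split it into word tokens, and drop stop words."""
    # Keep runs of ASCII letters and digits; punctuation is discarded
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


print(preprocess("TextRank can be used to extract summaries and keywords."))
# prints ['textrank', 'used', 'extract', 'summaries', 'keywords']
```

Token lists produced this way could stand in for the `nltk.tokenize.word_tokenize` step in `textrank`, so that very common words no longer dominate the co-occurrence graph.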