Fix the Python code above
Date: 2023-07-23 21:44:42 Views: 126
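The code has the following problems:
1. The variable `text` is undefined, so the code cannot run.
2. The variable `word_textrank` is never assigned, so the keywords cannot be printed later in the code.
3. In the step that maps each word to its TextRank value, `word = textrank[word] = textrank[i,0]` looks wrong; it should be `word_textrank[word] = textrank[i,0]`.
Here is the corrected code: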
```python
import operator

import numpy as np
import pandas as pd


def TextRank(text):
    window = 3
    win_dict = {}
    # Filter_word is the tokenizing/filtering helper from the original code (not shown here)
    filter_word = Filter_word(text)
    length = len(filter_word)
    # Build the window set for each node (only the first occurrence of each word is used)
    for word in filter_word:
        index = filter_word.index(word)
        # Set the left and right window bounds and clamp them to the list
        if word not in win_dict:
            left = max(index - window + 1, 0)
            right = min(index + window, length)
            words = set()
            for i in range(left, right):
                if i == index:
                    continue
                words.add(filter_word[i])
            win_dict[word] = words
    # Build the adjacency matrix of connected edges
    word_dict = list(set(filter_word))
    lengths = len(word_dict)
    matrix = pd.DataFrame(np.zeros([lengths, lengths]))
    for word in win_dict:
        for value in win_dict[word]:
            index1 = word_dict.index(word)
            index2 = word_dict.index(value)
            matrix.iloc[index1, index2] = 1
            matrix.iloc[index2, index1] = 1
    # Normalize each column by its own sum
    # (the sum is computed per column rather than accumulated across columns)
    for j in range(matrix.shape[1]):
        summ = matrix.iloc[:, j].sum()
        if summ > 0:  # skip columns with no edges to avoid dividing by zero
            matrix[j] /= summ
    # Compute the TextRank values with the iterative formula
    d = 0.85
    iter_num = 700
    word_textrank = {}
    textrank = np.ones([lengths, 1])
    for i in range(iter_num):
        textrank = (1 - d) + d * np.dot(matrix.to_numpy(), textrank)
    # Map each word to its TextRank value
    for i in range(len(textrank)):
        word = word_dict[i]
        word_textrank[word] = textrank[i, 0]
    keyword = 6
    print('---------------------')
    print('textrank model results:')
    for key, value in sorted(word_textrank.items(), key=operator.itemgetter(1), reverse=True)[:keyword]:
        print(key + '/', end='')
```
Note that `d` is set to `0.85` here, the damping factor conventionally used in the TextRank algorithm.
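To try the graph construction and the damped iteration in isolation, the sketch below works directly on a list of tokens, so it does not need `Filter_word`. It is not the original author's code: the function name `textrank_scores` and its parameters are hypothetical, it uses plain NumPy instead of pandas, and unlike the function above it links words at every occurrence rather than only the first.

```python
import numpy as np


def textrank_scores(tokens, window=3, d=0.85, iters=100):
    """Score tokens with a TextRank-style iteration over a co-occurrence graph.

    Hypothetical helper for illustration; `tokens` is an already filtered word list.
    """
    vocab = sorted(set(tokens))
    idx = {w: k for k, w in enumerate(vocab)}
    n = len(vocab)
    adj = np.zeros((n, n))
    # Link each word to the distinct words among the next `window - 1` tokens
    for i, w in enumerate(tokens):
        for v in tokens[i + 1:i + window]:
            if v != w:
                adj[idx[w], idx[v]] = adj[idx[v], idx[w]] = 1.0
    # Normalize each column by its own sum; columns without edges stay all zero
    col_sums = adj.sum(axis=0)
    norm = np.divide(adj, col_sums, out=np.zeros_like(adj), where=col_sums > 0)
    # Damped update: score = (1 - d) + d * M @ score
    scores = np.ones(n)
    for _ in range(iters):
        scores = (1 - d) + d * norm @ scores
    return dict(zip(vocab, scores))


if __name__ == "__main__":
    demo = textrank_scores(["hub", "a", "hub", "b", "hub", "c"], window=2)
    for word, score in sorted(demo.items(), key=lambda kv: kv[1], reverse=True):
        print(word, round(score, 3))
```

In the demo, `hub` is adjacent to every other token, so it ranks first. Because every column of the normalized matrix sums to 1 in that graph, the scores keep summing to the number of distinct words across iterations, which is a quick check that the per-column normalization is working.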