Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search


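Then we construct $M'$ as follows:

$$
\begin{aligned}
D &= p \times d^{T} \\
E &= p \times [1]_{1 \times n} \\
  &= \left[\tfrac{1}{n}\right]_{n \times n} \quad \text{if } p \text{ is the uniform distribution} \\
M' &= (1 - \alpha)(M + D) + \alpha E
\end{aligned}
\tag{4}
$$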
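The construction above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `M[v][u]` is taken to be 1/outdeg(u) when u links to v (a column convention), `d` is taken to flag nodes without out-links, and the 4-node graph, the function name `build_m_prime`, and the value `alpha=0.25` are hypothetical choices made for the example.

```python
def build_m_prime(M, p, alpha):
    """Return M' = (1 - alpha)(M + D) + alpha E as a list of rows.

    Assumption: M[v][u] = 1/outdeg(u) if u links to v, else 0, so a node
    u without out-links has an all-zero column in M.
    """
    n = len(M)
    # Assumed meaning of d: d[u] = 1 if node u has no out-links, else 0.
    d = [1.0 if all(M[v][u] == 0 for v in range(n)) else 0.0 for u in range(n)]
    m_prime = []
    for v in range(n):
        row = []
        for u in range(n):
            D_vu = p[v] * d[u]  # entry of D = p x d^T
            E_vu = p[v]         # entry of E = p x [1]_{1 x n}
            row.append((1 - alpha) * (M[v][u] + D_vu) + alpha * E_vu)
        m_prime.append(row)
    return m_prime


# Hypothetical graph: 0->1, 0->2, 1->2, 2->0, 2->3; node 3 has no out-links.
M = [
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
]
p = [0.25, 0.25, 0.25, 0.25]               # uniform distribution
M_prime = build_m_prime(M, p, alpha=0.25)  # alpha picked arbitrarily

# Column sums of M'; each should be 1 if M' is a valid transition matrix.
column_sums = [sum(M_prime[v][u] for v in range(4)) for u in range(4)]
```

With a uniform `p` and a positive `alpha`, every entry of `M_prime` is strictly positive, which is what makes the resulting chain irreducible in this small example.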

This modification improves the quality of PageRank by introducing a decay factor $1 - \alpha$ which limits the effect of rank sinks [26], in addition to guaranteeing convergence to a unique rank vector. Substituting $M'$ for $M$ in Equation 2, we can express PageRank as the solution to:

$$\mathit{Rank} = M' \times \mathit{Rank} \tag{5}$$

$$\phantom{\mathit{Rank}} = (1 - \alpha)(M + D) \times \mathit{Rank} + \alpha p \tag{6}$$

with $p = \left[\tfrac{1}{n}\right]_{n \times 1}$. The key to creating topic-sensitive PageRank is that we can bias the computation to increase the effect of certain categories of pages by using a nonuniform $n \times 1$ personalization vector for $p$.
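The step from Equation 5 to Equation 6 can be spelled out as follows. This is a short worked derivation, assuming that the entries of $\mathit{Rank}$ are nonnegative and sum to 1, and that $E = p \times [1]_{1 \times n}$ as in the construction of $M'$:

$$
\begin{aligned}
M' \times \mathit{Rank}
  &= (1 - \alpha)(M + D) \times \mathit{Rank} + \alpha E \times \mathit{Rank} \\
  &= (1 - \alpha)(M + D) \times \mathit{Rank} + \alpha\, p \times \left([1]_{1 \times n} \times \mathit{Rank}\right) \\
  &= (1 - \alpha)(M + D) \times \mathit{Rank} + \alpha\, p \sum_{i=1}^{n} \mathit{Rank}_i \\
  &= (1 - \alpha)(M + D) \times \mathit{Rank} + \alpha p
\end{aligned}
$$

The dense $n \times n$ matrix $E$ therefore never has to be formed explicitly; only the vector $p$ is needed.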

To ensure that $M'$ is irreducible when $p$ contains any 0 entries, nodes not reachable from nonzero nodes in $p$ should be removed. This modification is not problematic to implement. Note that the topic-based influencing involves introducing additional rank to the appropriate nodes in each iteration of the computation; it is not simply a postprocessing step performed on the standard PageRank vector.
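To make the point about per-iteration biasing concrete, here is a minimal sketch of iterating Equation 6 directly, in plain Python. It is not the authors' implementation. The function name `topic_sensitive_rank`, the 4-node graph, the stopping tolerance, and `alpha=0.25` are hypothetical, and `M[v][u]` is assumed to be 1/outdeg(u) when u links to v.

```python
def topic_sensitive_rank(M, p, alpha, tol=1e-12, max_iter=1000):
    """Iterate Rank <- (1 - alpha)(M + D) Rank + alpha p until it stabilizes."""
    n = len(M)
    # Assumed meaning of d: d[u] = 1 if node u has no out-links, else 0.
    d = [1.0 if all(M[v][u] == 0 for v in range(n)) else 0.0 for u in range(n)]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # D x Rank equals p scaled by the rank mass on nodes without out-links.
        dangling = sum(d[u] * rank[u] for u in range(n))
        new_rank = [
            (1 - alpha) * (sum(M[v][u] * rank[u] for u in range(n)) + p[v] * dangling)
            + alpha * p[v]  # extra rank injected at the nodes favoured by p
            for v in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical graph: 0->1, 0->2, 1->2, 2->0, 2->3; node 3 has no out-links.
M = [
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
]
uniform = topic_sensitive_rank(M, [0.25, 0.25, 0.25, 0.25], alpha=0.25)
biased = topic_sensitive_rank(M, [0.0, 1.0, 0.0, 0.0], alpha=0.25)
```

In this toy graph every node is reachable from node 1, so the reachability condition holds for the biased vector. Comparing `biased[1]` with `uniform[1]` shows the rank of node 1 rising when `p` concentrates on it, and the other entries shift as well because the extra rank propagates along the links in every iteration.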

In terms of the random-walk model, the personalization vector represents the addition of a complete set of transition edges where the probability on an artificial edge $(u, v)$ is given by $\alpha p_v$. We will denote the solution $\mathit{Rank}^*$ of Equation 6, with $\alpha = \alpha^*$ and a particular $p = p^*$, as $PR(\alpha^*, p^*)$. By appropriately selecting $p$, the rank vector can be made to prefer certain categories of pages. The bias factor $\alpha$ specifies the degree to which the computation is biased towards $p$.
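The random-walk reading can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not part of the original method description: the surfer follows a real out-link with probability 1 − α, takes an artificial edge to a node drawn from `p` with probability α, and (as an assumed treatment of nodes without out-links) always jumps according to `p` from such nodes. The function name `simulate_walk`, the graph, the step count, and the seed are all hypothetical.

```python
import random


def simulate_walk(out_links, p, alpha, steps, seed=0):
    """Estimate how often a biased random surfer visits each node.

    out_links[u] lists the nodes that u links to; p is the personalization
    vector given as a list of probabilities that sums to 1.
    """
    rng = random.Random(seed)
    nodes = list(range(len(out_links)))
    visits = [0] * len(out_links)
    u = rng.choices(nodes, weights=p)[0]
    for _ in range(steps):
        visits[u] += 1
        if rng.random() < alpha or not out_links[u]:
            # Artificial edge (u, v): land on v with probability p[v].
            u = rng.choices(nodes, weights=p)[0]
        else:
            # Real edge: follow one of u's out-links uniformly at random.
            u = rng.choice(out_links[u])
    return [count / steps for count in visits]


# Hypothetical graph: 0->1, 0->2, 1->2, 2->0, 2->3; node 3 has no out-links.
out_links = [[1, 2], [2], [0, 3], []]
freqs = simulate_walk(out_links, p=[0.0, 1.0, 0.0, 0.0], alpha=0.25, steps=200_000)
```

For a long enough walk the visit frequencies should approximate $PR(\alpha, p)$ for the same graph. Raising `alpha` makes the surfer return to the nodes favoured by `p` more often, which is the sense in which α controls the degree of bias.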

3 Topic-Sensitive PageRank

In our approach to topic-sensitive PageRank, we precompute the importance scores offline, as with ordinary PageRank. However, we compute multiple importance scores for each page; we compute a set of scores of the importance of a page with respect to various topics. At query time, these importance scores are combined based on the topics of the query to form a composite PageRank score for those pages matching the query. This score can be used in conjunction with other IR-based scoring schemes to produce a final rank for the result pages with respect to the query. As the scoring functions of commercial search engines are not known, in our work we do not consider the effect of these IR scores (other than requiring that the query terms appear in the page). We believe that the improvements to PageRank's precision will translate into improvements in overall search rankings, even after other IR-based scores are factored in. Note that the topic-sensitive PageRank score itself implicitly makes use of IR in determining the topic of the query. However, this use of IR is not vulnerable to manipulation of pages by adversarial webmasters seeking to raise the score of their sites.
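The query-time combination described above might look like the following sketch. The passage only says that the precomputed scores are "combined based on the topics of the query"; the normalized weighted sum used here, the function name `composite_scores`, the topic weights, and the toy rank vectors are all assumptions made for illustration.

```python
def composite_scores(topic_ranks, topic_weights, matching_pages):
    """Combine per-topic rank vectors into one score per matching page.

    topic_ranks[j][page] is the precomputed rank of `page` for topic j.
    topic_weights[j] is a hypothetical measure of how strongly the query
    relates to topic j; the weights are normalized to sum to 1.
    """
    total = sum(topic_weights)
    weights = [w / total for w in topic_weights]
    return {
        page: sum(w * ranks[page] for w, ranks in zip(weights, topic_ranks))
        for page in matching_pages
    }


# Two hypothetical topics over four pages (each row sums to 1).
topic_ranks = [
    [0.4, 0.3, 0.2, 0.1],  # ranks biased towards topic 0
    [0.1, 0.2, 0.3, 0.4],  # ranks biased towards topic 1
]
topic_weights = [0.75, 0.25]  # assumed: the query is mostly about topic 0

# Only pages containing the query terms (here pages 0 and 3) are scored.
scores = composite_scores(topic_ranks, topic_weights, matching_pages=[0, 3])
ordered = sorted(scores, key=scores.get, reverse=True)
```

Because the per-topic vectors are computed offline, the query-time work in this sketch is limited to one weighted sum per matching page.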

Equation 6 makes use of the fact that $\|\mathit{Rank}\|_1 = 1$.

Page et al. [26] originally suggest setting $p$ directly using the bookmarks of the user, although that approach is not practical for large numbers of users.

For instance, most search engines use term weighting schemes which make special use of HTML tags.
