Microblog Link Prediction: Applying Comprehensive Features and an Improved SVM Algorithm

Research paper

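This paper studies link prediction on Sina Microblog, a social media platform that has grown increasingly popular in recent years. Link prediction is an important topic in social network analysis, yet on a large-scale network such as Sina Microblog it has received little attention, even though many other research directions from traditional social network analysis have been applied to the platform. The authors, Yun Li, Kai Niu and Baoyu Tian of the Key Laboratory of Universal Wireless Communications at Beijing University of Posts and Telecommunications, propose an effective and comprehensive feature set tailored to the characteristics of Sina Microblog.

The authors first observe that microblog data carries several kinds of information, including users' social behavior, text content, time series and geographic location, all of which may influence whether two users connect. Their feature set takes these factors into account in order to capture potential links more accurately. Compared with traditional social network analysis, it places more weight on the varied behavior of individual users.

They then train the classifier with the fast classification algorithm for polynomial kernel support vector machines (FCPKSVM). Compared with a standard SVM, this algorithm keeps prediction performance high while moving most of the computation into the training phase, which substantially lowers the time complexity of the prediction phase. This matters for large datasets because it reduces the computational cost of online prediction.

Experiments validate both the feature set and the algorithm: a model trained with them predicts links between Sina Microblog users who are not yet connected, and does so accurately and efficiently. The work improves the precision of link prediction and gives other researchers a practical tool and reference framework for similar tasks on microblog-style platforms. Combining the platform's distinctive characteristics with an optimized learning algorithm is valuable, in theory and in practice, for understanding user behavior, building recommender systems and studying how social networks evolve.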
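The idea of moving computation from the prediction phase to the training phase can be illustrated for a degree-2 polynomial kernel. The sketch below is not taken from the paper and may differ from the actual FCPKSVM procedure; it only shows one well-known way to fold all support vectors into a fixed quadratic form once, so that each later prediction costs O(d^2) in the feature dimension d instead of growing with the number of support vectors. All function and parameter names (`precompute`, `decision_fast`, `gamma`, `coef0`) are hypothetical.

```python
def poly2_kernel(x, z, gamma=1.0, coef0=1.0):
    """Degree-2 polynomial kernel: (gamma * <x, z> + coef0) ** 2."""
    return (gamma * sum(a * b for a, b in zip(x, z)) + coef0) ** 2


def decision_slow(x, support_vectors, dual_coefs, bias, gamma=1.0, coef0=1.0):
    """Standard kernel decision value; cost grows with the number of support vectors."""
    return bias + sum(
        a * poly2_kernel(sv, x, gamma, coef0)
        for sv, a in zip(support_vectors, dual_coefs)
    )


def precompute(support_vectors, dual_coefs, bias, gamma=1.0, coef0=1.0):
    """Fold every support vector into (M, w, const); run once after training.

    Uses (g*s.x + c)^2 = g^2 * x^T (s s^T) x + 2*g*c * (s.x) + c^2.
    """
    d = len(support_vectors[0])
    M = [[0.0] * d for _ in range(d)]
    w = [0.0] * d
    const = bias
    for sv, a in zip(support_vectors, dual_coefs):
        for i in range(d):
            w[i] += 2.0 * gamma * coef0 * a * sv[i]
            for j in range(d):
                M[i][j] += gamma * gamma * a * sv[i] * sv[j]
        const += a * coef0 * coef0
    return M, w, const


def decision_fast(x, M, w, const):
    """Decision value from the precomputed form; cost depends only on d."""
    d = len(x)
    quad = sum(x[i] * M[i][j] * x[j] for i in range(d) for j in range(d))
    lin = sum(w[i] * x[i] for i in range(d))
    return quad + lin + const
```

Here `dual_coefs` stands for the products of each support vector's multiplier and label. The two decision functions agree up to floating-point error, and the precomputed form pays off when the number of support vectors is much larger than d.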

Proceedings of CCIS2014

Link Prediction in Sina Microblog using Comprehensive Features and Improved SVM Algorithm

Yun Li, Kai Niu, Baoyu Tian

Key Laboratory of Universal Wireless Communications,

Beijing University of Posts and Telecommunications,

Beijing, 100876, P. R. China

liyun_bupt@126.com

Abstract: Sina Microblog has become one of the most popular social networks in recent years. As a result, many interdisciplinary research directions from traditional social network analysis have been applied to it. However, the link prediction problem in Sina Microblog has not drawn much attention so far. In this paper, we study link prediction in Sina Microblog. Based on the characteristics of Sina Microblog, we propose an effective and comprehensive feature set for this task. We then apply the fast classification algorithm for polynomial kernel support vector machines (FCPKSVM) to train our classifier; by transferring most of the calculation from the prediction phase to the training phase, the time complexity of the prediction phase is greatly reduced. We show that a machine learning classifier trained on the proposed feature set obtains good, comparable prediction performance for link prediction in Sina Microblog, and that by introducing FCPKSVM our method achieves far lower time complexity in the prediction phase than other classical classifiers.

Keywords: link prediction, Sina Microblog, FCPKSVM

1 Introduction

Sina Microblog is the earliest and largest microblogging service in China, with over 100 million active users, and has attracted growing attention in recent years. Much work has examined the structural and behavioral properties of Sina Microblog, but few efforts have been made to solve the link prediction problem on it.

The work of M. A. Hasan and M. J. Zaki [1] shows that the link prediction problem is dominated by topological studies of the graphs used to represent social networks. In such a graph, a user is represented by a node and the social relationship between two users is represented by a link. These graphs change over time as users interact in the social network. Understanding their dynamics can start with analyzing how the association between two arbitrary nodes evolves, which is the link prediction problem.
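The graph formulation above can be sketched in a few lines. This is a minimal illustration rather than the paper's data pipeline: it assumes two snapshots of the network given as lists of user pairs, treats links as undirected for simplicity (follow relationships on a microblog are directed in practice), and labels every pair that is unlinked in the earlier snapshot by whether it is linked in the later one. The names `build_graph` and `label_pairs` are hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Build undirected adjacency sets from (user, user) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def label_pairs(edges_t0, edges_t1):
    """For each pair unlinked at t0, label 1 if linked at t1, else 0."""
    g0 = build_graph(edges_t0)
    g1 = build_graph(edges_t1)
    labels = {}
    for u, v in combinations(sorted(g0), 2):
        if v in g0[u]:
            continue  # already linked at t0, so there is nothing to predict
        labels[(u, v)] = int(v in g1.get(u, set()))
    return labels


edges_t0 = [("a", "b"), ("b", "c"), ("c", "d")]
edges_t1 = edges_t0 + [("a", "c")]
print(label_pairs(edges_t0, edges_t1))
# {('a', 'c'): 1, ('a', 'd'): 0, ('b', 'd'): 0}
```

Enumerating all pairs is quadratic in the number of users, so a network of realistic size would need the candidate pairs to be sampled or restricted, for example to users within two hops of each other.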

Early research on link prediction came mainly from computer science. R. R. Sarukkai [2] used Markov chains for link prediction and path analysis in computer networks. J. Zhu et al. [3] then applied link prediction based on Markov chains to adaptive web sites. Because of its unsatisfactory prediction performance, the Markov chain method was not widely adopted for link prediction in social networks.

Later, driven by the needs of practical applications, link prediction research spread to various domains, including social network analysis. Link prediction has been used to find interactions between proteins in bioinformatics [4], to help build recommendation systems in e-commerce [5] and to identify hidden links in social networks [6].

Supervised machine learning is an effective way to solve the link prediction problem. This approach was first used by Liben-Nowell and Kleinberg [7] and has since been extended repeatedly [6, 8, 9, 10, 11], achieving very good prediction performance on most datasets. Many dimensions of features can describe the link between two users [6], but most work using supervised learning relied only on topological features, because they apply equally to all domains. These methods also commonly had high time complexity in the prediction phase, which restricted their real-time application in huge social networks.
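To make "topological features" concrete, the sketch below computes four standard neighborhood-based scores for a candidate pair: common neighbors, the Jaccard coefficient, the Adamic-Adar index and preferential attachment. These are common choices in the link prediction literature; this excerpt does not state whether the paper's feature set contains exactly these scores, and the function name and adjacency-set representation are assumptions.

```python
import math


def topological_features(adj, u, v):
    """Neighborhood-based scores for the pair (u, v) in an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Rare shared neighbors count more than popular ones.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1
        ),
        "preferential_attachment": len(nu) * len(nv),
    }


adj = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
print(topological_features(adj, "a", "d"))
```

Each score needs only the two neighbor sets, which is why such features transfer across domains: they use no information about what the nodes represent.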

In this paper, we propose an effective solution to the link prediction problem in Sina Microblog based on FCPKSVM trained on a feature set that includes not only topological features but also features extracted from the microblogs users have posted and from users' attributes. The proposed method is compared with several classical classification algorithms. In addition, we use the information gain of each attribute to show how the features in our feature set differ in effectiveness.
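As a reminder of what an information gain score measures, the following sketch computes it for a discrete feature against class labels: the entropy of the labels minus their expected entropy after splitting on the feature. It is a textbook definition, not the paper's evaluation code. Continuous features would first need to be discretized, and the binning scheme is left open here.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy of labels given the feature value."""
    groups = defaultdict(list)
    for x, y in zip(feature_values, labels):
        groups[x].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


labels = [1, 1, 0, 0]
print(information_gain([1, 1, 0, 0], labels))  # feature determines the label: 1.0
print(information_gain([1, 0, 1, 0], labels))  # feature is uninformative: 0.0
```

A feature with higher information gain removes more uncertainty about whether a link exists, which gives a simple way to rank features by contribution.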

The rest of the paper is organized as follows. Section 2 gives a detailed description of Sina Microblog and the dataset we collected. Section 3 introduces each feature in our feature set. Section 4 presents our experimental setup, the effectiveness and efficiency of the selected algorithms, and the information gain value of each feature to show their different contributions. Section 5 gives our conclusion.

2 Dataset of Sina Microblog
