
Journal of Electronics & Information Technology, Vol. 39, No. 9, Sept. 2017
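NSGA2-based Multi-label Seed Node Selection in Network Environments
LI Lei*①  CHU Yuqi①  WANG Meng①  HAN Li②  WU Xindong①③
①(School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China)
②(Basic Research Management Center, Ministry of Science and Technology, Beijing 100862, China)
③(School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette 70503, USA)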
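Abstract: As social networks keep growing in scale, the labels of network nodes are no longer single but rich and varied, which has made multi-label classification in social networks an important research area. Earlier work concentrated on improving the accuracy of predicting node labels and ignored the system overhead, including time consumption and computing resources, incurred in obtaining node information. Today, as networks grow ever larger and more complex, this previously ignored overhead is becoming increasingly serious, raising both the cost and the difficulty of predicting node labels. To address this problem, this paper proposes an NSGA2-based multi-label seed node selection algorithm in network environments (the NAMESEA algorithm), which aims to improve label prediction accuracy to some extent while greatly reducing the system overhead of predicting node labels. NAMESEA is compared experimentally with other multi-label prediction algorithms on several real datasets, and the results show that it greatly reduces the system overhead of predicting node labels and improves prediction accuracy.
Key words: Social networks; Multi-label classification; NSGA2; System overhead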
CLC number: TP393  Document code: A  Article ID: 1009-5896(2017)09-2040-08
DOI: 10.11999/JEIT161266
NSGA2-based Multi-label Seed Node Selection in Network Environments
LI Lei①  CHU Yuqi①  WANG Meng①  HAN Li②  WU Xindong①③
①(School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China)
②(Basic Research Management Center, Ministry of Science and Technology, Beijing 100862, China)
③(School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette 70503, USA)
Abstract: With the expanding scale of social networks, node labels are no longer single but diverse, which has made multi-label classification in social networks an important research area. Previous research focuses on improving the precision of the predicted labels, while ignoring the system overhead caused by obtaining the node information, such as time consumption and memory occupancy. Now, as the scale and complexity of networks keep increasing, the previously neglected system overhead is becoming more and more serious. It increases both the cost and the difficulty of predicting labels. In this paper, an NSGA2-based multi-label seed node selection algorithm in network environments (NAMESEA) is proposed to improve the accuracy of label prediction while reducing both the time consumption and the memory occupancy. Experiments against other multi-label prediction algorithms on multiple real datasets show that NAMESEA not only greatly reduces the system overhead but also improves the prediction accuracy.
Key words: Social networks; Multi-label classification; NSGA2; System overhead
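Received: 2016-11-24; Revised: 2017-04-11; Published online: 2017-05-11
*Corresponding author: LI Lei lilei@hfut.edu.cn
Foundation Items: The National 973 Program of China (2013CB329604), The National Key Research and Development Program of China (2016YFB1000901), The National Natural Science Foundation of China (61503114)
1 Introduction
In recent years, with the development and spread of social network applications, social networks have attracted the attention of more and more researchers [1-5], and one important research direction is the multi-label prediction problem in social networks [1]. With label prediction, the labels of unknown users can be predicted from the labels of known users in the network, so that users can be classified and grouped into communities, and information can then be recommended in a targeted way. Although some algorithms [1,6-8] already address multi-label prediction in networks, as social networks keep growing in scale and their data structures become increasingly complex, the system overhead incurred in obtaining node information, especially the time spent and the system memory consumed, has been continuously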