A Collaborative Filtering Algorithm Based on
Social Network Information
Rui Wang
Department of Computer Science and
Technology
Harbin Institute of Technology at
Weihai
Weihai, China
rena_wang521@163.com
Bailing Wang
Department of Computer Science and
Technology
Harbin Institute of Technology at
Weihai
Weihai, China
wbl@hit.edu.cn
Junheng Huang
Department of Computer Science and
Technology
Harbin Institute of Technology at
Weihai
Weihai, China
hithjh@163.com
Abstract—In traditional collaborative filtering
recommendation, matrix sparsity and the cold start problem
restrict the accuracy of the system. In this paper, we develop
a way to enhance recommendation effectiveness by merging
neighborhood relationships and users' keywords from social
network information into collaborative filtering. We extend the
calculation of the top-N neighbors, which is the most important
step of the algorithm, in two respects. Our method expands the
information available to collaborative filtering, improves
recommendation accuracy, and eases the cold start problem in
recommendation systems. We conduct experiments on the real
KDD 2012 data set. The results indicate that our algorithm
outperforms the traditional collaborative filtering algorithm.
Keywords-social network; recommendation system; data
mining; collaborative filtering
I. INTRODUCTION
With the exponential growth of Internet data, human
society has stepped into the Age of Big Data. It is
increasingly difficult for users to find the information
they need in such huge data sets. Traditional search engine
technology presents every user with the same ranking results,
whereas users want personalized recommendations that match
their own preferences.
Researchers have proposed a variety of recommendation
algorithms and developed corresponding personalized
recommendation systems, some of which have been
successfully applied in industry. Among these approaches,
collaborative filtering (CF) has become the most popular
because of its easy implementation and good extensibility [1].
To predict user u's preference for item i, the algorithm first
finds the user set $N_u$ whose members share similar rating
behavior with user u according to previous rating records, and
then estimates user u's preference for item i from the
preferences for item i of all users in $N_u$ [2].
CF has been successfully applied on Amazon and other sites.
However, it has several problems: (1) Data sparsity. The
rating matrix composed of users' ratings of items is in most
cases very sparse, so the neighborhood of a user cannot be
calculated correctly and effectively. (2) Cold start. Because
a new user has no rating records, the neighborhood of this
user cannot be calculated, and the user cannot receive
effective recommendations. (3) CF calculates the neighborhood
on the basis of similar interests alone, so it cannot
distinguish friends from strangers with similar interests.
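The neighborhood-based prediction described above, and the way it fails for a new user, can be illustrated with a minimal sketch. This is not the implementation used in this paper: the cosine similarity measure, the function names, and the toy ratings are all assumptions chosen for illustration.

```python
from math import sqrt

# Toy rating records (hypothetical data): user -> {item: rating}.
ratings = {
    "alice": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
    "dave": {},  # new user with no rating records
}


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two rating dicts, treating unrated items as 0."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = sqrt(sum(r * r for r in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_neighbors(ratings, user, item, n=2):
    """Return up to n (similarity, neighbor) pairs: the users most
    similar to `user` among those who have rated `item`."""
    candidates = []
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings.get(user, {}), other_ratings)
        if sim > 0:
            candidates.append((sim, other))
    candidates.sort(reverse=True)
    return candidates[:n]


def predict(ratings, user, item, n=2):
    """Similarity-weighted average of the neighbors' ratings of `item`.
    Returns None when no neighbor can be found (e.g. cold start)."""
    neighbors = top_n_neighbors(ratings, user, item, n)
    total_sim = sum(sim for sim, _ in neighbors)
    if total_sim == 0:
        return None
    weighted = sum(sim * ratings[other][item] for sim, other in neighbors)
    return weighted / total_sim


print(predict(ratings, "alice", "c"))  # estimate lies between the neighbors' ratings
print(predict(ratings, "dave", "c"))   # None: no rating records, so no neighborhood
```

In this sketch the new user has no ratings, so every similarity is zero and no neighborhood exists, which is the cold start problem in item (2). A very sparse rating matrix produces the same effect for existing users whose rated items rarely overlap.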
Previous recommendation systems are all based on the
hypothesis that users are independent and identically
distributed. In practice, however, people often ask their
friends for advice, which plays an important role in the final
decision. The rapid development of social networking services
(SNS) such as Facebook, Twitter, and Tencent provides a great
platform for communication. Friend relationships and
user-related information in SNS can supply additional
information to a recommendation system. Recently, the use of
social network information to improve the performance of
recommendation systems has attracted considerable attention
from researchers [3]-[6].
This paper proposes a collaborative filtering algorithm
that incorporates the neighborhood relationships and user tags
of a social network, and extends the key step of the algorithm,
computing the user's neighborhood, in two respects. The
algorithm expands the information available to CF, improves
recommendation accuracy, and eases the cold start problem in
recommendation systems. We conducted experiments on the real
KDD 2012 data set. The results indicate that our algorithm
outperforms the traditional CF algorithm.
II. RELATED WORK
A. Preliminaries
The most fundamental elements of a recommendation
system are the user set and the item set. We denote the user
set by $U = \{u_1, u_2, \ldots, u_m\}$, where m is the number
of users, and the item set by $I = \{i_1, i_2, \ldots, i_n\}$,
where n is the number of items.