0162-8828 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TPAMI.2016.2605085, IEEE
Transactions on Pattern Analysis and Machine Intelligence
3
matrix to construct novel fusing models and to predict the
genuine interest of users in a more reasonably way.
Previously, we have published a conference paper to
briefly present and preliminarily validate the aforemen-
tioned idea [37], which will be extensively extended in this
work by supplementing new models, algorithms, theoretical
analysis as well as sufficient experimental results. The rest
of the paper is organized as follows. In Section 2 we propose
a novel matrix factorization based social CF method named
TrustMF. In Section 3, we discuss the rationale of TrustMF
from a probabilistic view and then contribute a more general
framework, named TrustPMF, to predict a user’s preferences
more effectively by fitting rating data and trust data in an
approximate, rather than deterministic, way. In Section 4,
we validate the performance of TrustMF and TrustPMF by
comparing them with several representative methods on
four large real-world datasets.
2 THE TRUSTMF MODEL
2.1 Factorization of Rating Matrix
Suppose we have a recommender system which involves m
users and n items. Let R = [R
ij
]
m×n
denote the observed
user-item rating matrix, where each known element R
ij
denotes the observed rating of item j given by user i, which
is normally an integer from 1 to R
max
(e.g., 1 to 5), and
each unknown element indicates the user has not rated the
item yet. By mapping users and items to a low-dimensional
feature (factor) space in which each factor can be deemed as
any aspect, topic, or other implicit feature relevant to items
or users’ interests [17], each user or item can be represented
by a low-dimensional vector. Let d-dimensional vectors U
i
and V
j
be the user-specific latent feature vector of user i
and item-specific latent feature vector of item j, respectively.
Then we have user feature matrix U ∈ R
d×m
and item
feature matrix V ∈ R
d×n
.
By employing low-rank approximation over the ob-
served rating matrix, we can recover unknown ratings by
U
T
V . Formally, the feature matrices U and V can be learned
by minimizing a loss function as follows:
L =
X
(i,j)∈Ω
(U
T
i
V
j
− R
ij
)
2
+ λ(kUk
2
F
+ kV k
2
F
)
where Ω = {(i, j) : R
ij
6= 0} denotes the locations of ob-
served ratings in rating matrix, k · k
F
denotes the Frobenius
norm, and parameter λ controls the model complexity to
avoid over-fitting.
To further avoid over-fitting when learning parame-
ters, a so-called weighted-λ-regularization was introduced
to above model [41]. It increases the penalization to the
feature vectors of users or items involving more ratings.
Correspondingly, the new objective is obtained as follows:
L =
X
(i,j)∈Ω
(U
T
i
V
j
−R
ij
)
2
+λ(
X
i
n
u
i
kU
i
k
2
F
+
X
j
n
v
j
kV
j
k
2
F
)
where n
u
i
and n
v
j
denote the numbers of ratings given by
user i and given to item j, respectively.
2.2 Factorization of Trust Network
Suppose we have a trust network N with m nodes, where
nodes represent users and arcs represent directional trust re-
lationships among users. The adjacent matrix T = [T
ik
]
m×m
can be used to describe the structure of N, in which T
ik
is normally a real number within the interval [0, 1], repre-
senting the extent user i trusts k (e.g.,“1” indicates user i
extremely trusts k ). Since i trusts k does not always means
that k trusts i in the same way, T may not be symmetric.
Due to the asymmetric property of trust, in this work
we map each user i as two distinct d-dimensional latent
feature vectors, depicted by truster-specific feature vector
B
i
and trustee-specific feature vector W
i
, respectively. B
i
and W
i
characterize the behaviors of “to trust others” and
“to be trusted by others” of user i, respectively. Given
such vectors, one can model trust value T
ik
as the inner
product of B
i
and W
k
. It is important to notice that, this
way of modeling trust relations is totally distinct from
existing methods, which explicitly explores the real-world
implications of feature vectors. We will see later this new
way of modeling trust values not only characterizes the
directional property of trust in a more accurate way, but also
gives a better interpretation of how the mutual trust among
users are generated and how they affect a user’s respective
opinions.
Provided we have only trust data, the feature matrices
B ∈ R
d×m
and W ∈ R
d×m
can be learned by minimizing
the objective function:
L =
X
(i,k)∈Ψ
(B
T
i
W
k
− T
ik
)
2
+ λ(kBk
2
F
+ kWk
2
F
)
where Ψ = {(i, k) : T
ik
6= 0} denotes the locations of
observed trust relations in the trust matrix.
2.3 TrustMF Model
We have performed matrix factorization just based on either
rating data or trust data. In this section, we will present our
matrix factorization model to fuse both.
2.3.1 Truster Model
As mentioned before, users of social networking such as
Epinions are able to browse and generate opinions (ratings
or reviews) on items they are interested in and then build
their respective trust nets based on such opinions. Through
tangled trust ties, the opinions of an individual will be
affected by others and vice versa. Here we first propose a
truster model to characterize the first aspect, or how others
will affect a specific user’s opinions.
Note that the m users getting involved in rating matrix R
and trust matrix T are the same. Therefore, one can associate
R and T into a single matrix factorization process by sharing
a common user-specific latent feature space. Fig.2(a) shows
the proposed truster model that is capable of characterizing
how a user i’s opinions in terms of ratings are affected by
other users he/she trusts by means of B
T
i
V
j
.
In this model, we choose the truster-specific feature
matrix B as the latent space commonly shared by R and
T , meaning the user feature matrix U is approximated by
truster feature matrix B. Here each vector B
i
simultane-
ously characterizes two aspects: how a user i trusts (or