mendation (see Section 2.3 for details). Such characteristics are
independent of individual users and can potentially benefit
location recommendation. Motivated by this, in this paper we propose
to study geographical characteristics from the location perspective,
with an emphasis on the geographical neighborhood.
Through our empirical analysis, we found that nearest neighboring
locations tend to share more common visitors. This observation
indicates that, in general, a user has similar preferences
on nearest neighboring locations. In our method, this pattern is
used as the instance-level geographical neighborhood characteristics.
Specifically, by introducing a similarity measure, the relationship
underlying a user’s preferences on a few nearest neighboring
locations is captured and utilized to characterize her preference
on a target location.
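This instance-level idea can be roughly sketched as follows (an illustrative approximation, not the exact formulation in this paper; the similarity measure and all names below are hypothetical): a user's preference on a target location is partly characterized by her preferences on its nearest neighboring locations, weighted by geographical similarity.

```python
import numpy as np

def geo_similarity(target_coord, neighbor_coords):
    """Hypothetical similarity: inverse geographical distance
    (plus 1 to avoid division by zero), normalized to sum to 1."""
    dists = np.linalg.norm(neighbor_coords - target_coord, axis=1)
    sims = 1.0 / (1.0 + dists)
    return sims / sims.sum()

def neighborhood_preference(user_prefs_on_neighbors, weights):
    """Characterize preference on the target location by the
    similarity-weighted preferences on its nearest neighbors."""
    return float(np.dot(weights, user_prefs_on_neighbors))

# toy example: a target location and its 3 nearest neighbors
target = np.array([0.0, 0.0])
neighbors = np.array([[0.1, 0.0], [0.0, 0.2], [0.5, 0.5]])
w = geo_similarity(target, neighbors)
score = neighborhood_preference(np.array([1.0, 1.0, 0.0]), w)
```

Closer neighbors receive larger weights, so a user who frequents a location's nearest neighbors obtains a higher predicted preference on that location.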
In addition, our study on check-in data also shows that locations
in a geographical region may share similar user preferences. Each
geographical region is usually associated with a specific function
(e.g., business, entertainment, and education) [4, 29]. Such geo-
graphical regions contain additional prior knowledge about the in-
teractions between users and locations. Therefore, we use the geo-
graphical region structure as the region-level geographical neigh-
borhood characteristics. Specifically, a group lasso penalty is
employed to integrate the geographical region structure within the
check-in data into the learning of latent factors of users and loca-
tions. To summarize, the major contributions made in this paper
are as follows:
• We empirically analyze geographical characteristics from the loca-
tion perspective using historical check-in data collected from
Gowalla. Two levels (i.e., instance-level and region-level) of
geographical neighborhood characteristics are studied.
• We propose a novel location recommendation framework, i.e.,
Instance-Region Neighborhood Matrix Factorization (IRenMF),
to incorporate the aforementioned geographical neighborhood
characteristics for improving location recommendation accuracy. In
particular, to solve the optimization problem in IRenMF, we pro-
pose an alternating optimization strategy, in which an accelerated
proximal gradient (APG) method is used to learn the latent
factors of locations, considering the geographical region struc-
ture of the check-in data.
• We extensively evaluate IRenMF on the datasets collected from
Gowalla. Experimental results show that: (1) Compared to the
baseline matrix factorization model (i.e., WRMF) that does not
utilize geographical characteristics, our methods that use either
instance-level or region-level geographical neighborhood
characteristics significantly improve recommendation accuracy; the
average improvements are 30.78% and 17.2%, respectively,
in terms of the precision metric P@5. (2) By utilizing both levels
of geographical neighborhood characteristics, IRenMF substantially
outperforms two state-of-the-art location recommendation
methods that consider geographical characteristics derived
from the user perspective.
The rest of this paper is organized as follows. Section 2 reviews
the work most relevant to this study. Section 3 introduces our
empirical analysis of check-in data and provides some background
on matrix factorization. Next, Section 4 presents the details of
IRenMF and describes its optimization solution. Then, in Section 5,
we report the results of experiments conducted on real-world
datasets, which demonstrate that geographical characteristics
derived from the location perspective can significantly improve
location recommendation accuracy. Finally, Section 6 concludes
this study.
2. RELATED WORK
This study relates to three research areas: collaborative filtering,
structured sparsity learning, and personalized location
recommendation. Next, we present an overview of the most related
work in each area.
2.1 Collaborative Filtering
The collaborative filtering (CF) technique has been widely used
for personalized recommendation tasks [1]. In the literature, there
are two main categories of CF methods, namely memory-based CF
and model-based CF.
The most popular memory-based CF approaches are user-based
CF and item-based CF. The user-based CF predicts a user’s prefer-
ence on a given item based on the preferences of other similar users
on the target item. In contrast, the item-based CF first finds similar
items to the target item and then predicts a user’s preference on the
target item, according to her preferences on other similar items. In
previous work [26,27], the user-based CF has been successfully ap-
plied for the location recommendation tasks, and it has been found
to significantly outperform item-based CF. Similar results have also
been found in our experiments (see Section 5.2.1).
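As a generic illustration of the user-based CF scheme described above (not the specific variants of [26, 27]), a user's preference on a target item can be predicted as the similarity-weighted average of other users' preferences on that item:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two users' preference vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def user_based_predict(R, u, i):
    """Predict user u's preference on item i as the similarity-weighted
    average of other users' observed preferences on item i."""
    sims = np.array([cosine_sim(R[u], R[v]) if v != u else 0.0
                     for v in range(R.shape[0])])
    denom = sims.sum()
    return float(np.dot(sims, R[:, i]) / denom) if denom > 0 else 0.0

# toy check-in-style matrix (rows: users, columns: locations)
R = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
pred = user_based_predict(R, u=0, i=2)
```

Item-based CF follows the symmetric scheme, weighting over similar items rather than similar users.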
The model-based CF approaches aim to learn an algorithmic
model to explain the observed user-item interactions. Most exist-
ing model-based CF models deal with rating prediction problems,
in which users’ preferences are explicitly indicated by the rating
values given to items. However, in most practical scenarios, users’
rating data are unavailable; personalized recommendations are
instead based on users’ implicit feedback, e.g., purchase history [20], click
history [5], and check-in history [3]. In [8], Hu et al. introduced
one of the first attempts to apply model-based CF approaches to
large-scale implicit feedback datasets. Instead of ignoring all
missing entries in the user-item interaction matrix, they treated
all missing entries as negative examples and assigned vastly varying
confidence levels to the positive and negative examples. Then, a
least squares loss function was used to learn the latent features
of users and items.
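The confidence-weighted least squares objective of Hu et al. [8] can be sketched as below. The confidence form c = 1 + alpha * r and the parameter values are commonly cited illustrative defaults, not the exact settings used in this paper.

```python
import numpy as np

def wrmf_loss(R, U, V, alpha=40.0, reg=0.1):
    """Weighted least squares loss over ALL user-item entries:
    missing entries are treated as negatives (preference 0) with low
    confidence, observed entries as positives with high confidence."""
    P = (R > 0).astype(float)      # binary preference matrix
    C = 1.0 + alpha * R            # confidence: larger for observed entries
    err = P - U @ V.T              # prediction error on every entry
    loss = np.sum(C * err ** 2)
    loss += reg * (np.sum(U ** 2) + np.sum(V ** 2))  # L2 regularization
    return float(loss)

rng = np.random.default_rng(0)
R = np.array([[2., 0.], [0., 1.]])   # toy check-in counts
U = rng.normal(scale=0.1, size=(2, 3))
V = rng.normal(scale=0.1, size=(2, 3))
loss = wrmf_loss(R, U, V)
```

Because every entry contributes to the loss, the model is typically fit with alternating least squares rather than stochastic sampling.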
To simplify the computation, Pan et al. [19] proposed a sampling
framework to collect a fraction of missing entries in the user-item
matrix as negative examples. Recently, Rendle et al. [21] presented
a Bayesian Personalized Ranking (BPR) approach for recommendation
with users’ implicit feedback. This method assumed only that, for an
individual user, the unobserved items (i.e., negative items) were
less interesting than the observed items (i.e., positive items). Its
objective was to rank the positive items above the negative items
within the positive-negative item pairs of each user.
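The BPR criterion can be sketched by its per-triple loss, the negative log-sigmoid of the score difference between a positive and a negative item (a minimal illustration; variable names are hypothetical):

```python
import numpy as np

def bpr_loss(x_ui, x_uj):
    """BPR loss for one (user u, positive item i, negative item j) triple:
    -ln sigma(x_ui - x_uj), which decreases as the positive item's
    predicted score x_ui rises above the negative item's score x_uj."""
    diff = x_ui - x_uj
    return float(-np.log(1.0 / (1.0 + np.exp(-diff))))
```

Summing this loss over sampled positive-negative pairs of each user, plus a regularization term, yields the pairwise ranking objective that BPR optimizes with stochastic gradient descent.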
2.2 Structured Sparsity Learning
For feature learning problems in high-dimensional space, sparsity
is a desirable property of feature vectors. The L1-norm penalty
has been widely used to enforce sparsity on the learned feature
vectors. Numerous methods have been proposed to solve L1-regularized
regression problems in statistics, machine learning,
and computer vision. Readers can refer to [6] (Chapter 18) for
more discussions.
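The sparsity-inducing effect of the L1-norm penalty can be illustrated by its proximal operator, the soft-thresholding function, which proximal gradient solvers for L1-regularized regression apply at each iteration (a minimal sketch):

```python
import numpy as np

def soft_threshold(w, lam):
    """Proximal operator of the L1-norm penalty: shrinks every
    coefficient toward zero and sets those with magnitude below
    lam exactly to zero, which is what induces sparsity."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w = np.array([3.0, -0.5, 0.2, -2.0])
w_sparse = soft_threshold(w, lam=1.0)
```

Coefficients with absolute value at most lam are zeroed out exactly, rather than merely shrunk as under an L2 penalty.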
However, the standard L1-norm penalty does not consider any
dependency (e.g., group structure) among the inputs, which limits
its applicability in many complex application scenarios. In recent
years, many structured sparsity learning approaches have been
proposed to induce different joint sparsity patterns on related
variables by imposing more structured constraints on the inputs.
Yuan and Lin [30] proposed the group lasso model, which assumes a
disjoint group structure of the inputs and enforces sparsity on
pre-defined groups of features. It employed an L1/Lp mixed-norm
penalty, with p > 1, to achieve group-level variable selection. In
practice, L1/L2