1041-4347 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TKDE.2020.2979176, IEEE
Transactions on Knowledge and Data Engineering
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 14, NO. 8, AUGUST 2015 3
Given an object o and s, the score of o for s, scor e(s, o), is
calculated as follows [10], [13]:
score(s, o) = s.α · dist(s.p, o.p) + (1 − s.α ) · text(s.t, o.t)
(1)
where dist(s.p, o.p) is the spatial proximity and
text(s.t, o.t) is the textual similarity between s and o.
Given an object set O, the answer of s is a subset of O,
A, such that (1) |A| = k , and (2) ∀o
∗
∈ A, ∀o
′
∈ O\A,
score(s, o
∗
) ≤ score(s, o
′
)
1
.
In the following of this paper, we abbreviate s.k and s.α as
k and α, respectively, if there is no ambiguity.
To compute spatial proximity, we utilize the Euclidean
distance dist(s.p, o.p) =
Edist(s.p,o.p)
Maxdist
as with [3], [4], [5],
[10], [14], [15], [16]
2
. Edist(s.p, o.p) is the Euclidean distance
between s.p and o.p, and M axdist is the maximum distance
in the spatial area R
2
where subscriptions and objects exist.
We see that dist(·, ·) ∈ [0, 1]. For textual similarity, there
are three widely used set-based similarity functions, namely
Jaccard, Dice, and Cosine similarities [18]. Lamps can em-
ploy any similarity function that uniquely determines the
textual similarity by a given subscription and an object.
(Rev-2) In this paper, we use Jaccard similarity as the default
function, because it is usually used in set similarity search
[19], i.e., text(s.t, o.t) = 1 −
|s.t∩o.t|
|s.t∪o.t|
. Note that in Section 7.3
we verify the performance of Lamps when Dice and Cosine
similarities are employed. Then, score(s, o) is within [0, 1].
Problem statement. We consider a set of MkST subscrip-
tions S and a dynamic set O of spatio-textual objects. Sub-
scribers can move randomly at any time (i.e., movements
of users) [4]. Publishers can insert their newly generated
objects into O (i.e., object generation) and remove objects
that they have generated from O (i.e., object expiration).
We aim to continuously monitor the top-k results for all
subscriptions against object generation, object expiration,
and movements of users.
Example 2. Fig. 2 shows a running example used throughout
this paper. In this example, there are ten registered
subscriptions {s
1
, · · · , s
10
} and eight objects have been
generated {o
1
, · · · , o
8
}. Moreover, two objects o
9
and
o
10
are newly generated, and we assume s
1
.k = 2.
Specifically, o
2
and o
5
have high spatial and textual
similarities to s
1
. Thus, the top-k results of s
1
are o
2
and
o
5
.
2.2 Safe region technique
Lamps employs the concept of the safe region [12] to mon-
itor top-k results. Therefore, we first define the safe region
and a relevant concept, the dominant region.
1. As with [6], [13], to guarantee the top-k results are textual-relevant,
an object must contain at least one common keyword with a subscrip-
tion to become its top-k result.
2. The works in [11], [17] utilized the road network distance to
compute spatial proximity. Because the computaion cost of measureing
road network distance is higher than that of measureing Eucllidean
distance, we utilize the Euclidean distance to monitor the top-k results
of more subscriptions. However, Lamps can employ the road network
distance. In Section 6, we discuss the modifications needed when the
road network distance is used.
Keywords
Keywords
: subscription : object
Newly generated.
Fig. 2: Running example. Eight objects {o
1
, · · · , o
8
} have
been generated and two objects o
9
and o
10
are newly gener-
ated.
Definition 3 (Safe region). Given O, s = (p, t, k, α), and A of
s, the safe region of s, R, is:
R = {p
′
|∀o
∗
∈ A, ∀o
′
∈ O\A, score(s
′
, o
∗
) ≤ score(s
′
, o
′
)}
where s
′
is s after moving to a new location p
′
, i.e., s
′
=
(p
′
, t, k, α ).
Note that the safe region of s is a region where the top-k
result of s does not change.
Definition 4 (Dominant region). Given s = (p, t, k, α) and
two objects o
∗
and o
′
, the dominant region of o
∗
to o
′
, D
o
∗
,o
′
,
is:
D
o
∗
,o
′
= {p
′
|score(s
′
, o
∗
) ≤ score(s
′
, o
′
)}
where s
′
is s after moving to a new location p
′
, i.e., s
′
=
(p
′
, t, k, α ).
We see that the dominant region of o
∗
to o
′
is a region where
score(s, o
∗
) ≤ score(s, o
′
). The following lemma is showed
and proved in literature [10].
Lemma 1. Given O, s, and A, the safe region of s, R, is R =
∩
o
∗
∈A
(∩
o
′
∈O\A
D
o
∗
,o
′
).
Local safe region. To efficiently compute a safe region, lit-
erature [10] proposed a local safe region (LSR), which is a
subset of the safe region. Notice that we can monitor the
exact top-k results by re-evaluating them only when users
exit their LSR. [10] models each object o as a circle C
o
with
a center o.p and radius of r
o
=
1−α
α
· text(s.t, o.t). Then,
score(s, o) = α(dist(s.p, o.p) + r
o
). The LSR is computed in
the following three steps.
Step 1. For each object o
∗
∈ A, we compute an ellipse E
o
∗
and the intersection of these ellipses E = ∩
o
∗
∈A
E
o
∗
. Note
that
E
o
∗
= {p
′
|dist(s
′
.p
′
, o
∗
.p) + dist(s.p, s
′
.p
′
) ≤ γ − r
o
∗
}, (2)
where γ = max{dist(s.p, o
∗
.p) + r
o
∗
|o
∗
∈ A} + ∆ and ∆
is a parameter of an approximate ratio. Obviously E
o
∗
is an
ellipse with s.p and o
∗
.p as two foci.
Example 3. In Fig. 3, because s
1
.k = 2 and the top-k results
of s
1
are o
2
and o
5
, E is the intersection of E
o
2
and E
o
5
.
Step 2. Let C
s
be a circle centered at s.p with radius γ.
When s
′
.p
′
∈ E, each object o does not become the top-k
result of s
′
if C
o
is not inside C
s
. If C
o
is not inside C
s
,
γ ≤ dist(s.p, o.p) + r
o
. Then, score(s
′
, o
∗
) < score(s
′
, o) for
Authorized licensed use limited to: Shanghai Jiaotong University. Downloaded on September 22,2020 at 12:24:04 UTC from IEEE Xplore. Restrictions apply.