TABLE II. SUMMARY OF NOTATION
Notation Definition
T Trajectory
Q Exemplar query
ˆ
S (Q, T ) Trajectory similarity between Q and T (Definition 6)
p
i
, p
j
Points p
i
and p
j
ˆ
S
T
(p
i
, p
j
) Textual similarity between p
i
and p
j
ˆ
S
S
(p
i
, p
j
) Spatial similarity between p
i
and p
j
ˆ
S(p
i
, p
j
) Spatial-textual similarity between p
i
and p
j
unseen UB The upper bound across all unseen trajectories’ similarity
LB
seen
(Q, T )
A function returning the lower bound
of the trajectory similarity between Q and T
UB
seen
(Q, T )
A function returning the upper bound
of trajectory similarity between Q and T
BG (c)
The gap between the lower and upper
bound in the c-th round of expansion (Definition 8)
Den (q
i
)
The similarity density of the query
point q
i
in the latest round of expansion
ˆ
S
c
(q
i
)
The minimum similarity of candidate points
for q
i
in the c-th round of expansion
R
c
(q
i
) The ranked list for q
i
in the c-th round of expansion
C
tra
The set of all checked trajectories
it
max
The number of iterations to scan all points in candidate points
Spa
c
(q
i
) The similarity sparsity of q
i
in c-th expansion
Definition 1: (Point) A point p = (loc, act) is a pair
consisting of a location loc and a set of associated keywords
act = (t
1
, t
2
, . . . , t
i
) describing the loc and/or the activities at
loc.
Definition 2: (Trajectory) A trajectory T of length n is in
the form of p
1
, p
2
, . . . , p
n
, where each p
i
is a point.
Definition 3: (Query) A query Q (of size m) is a set of
points in the form of {q
1
,q
2
,. . . ,q
m
}.
The similarity between T and Q is computed between
points which share at least one common keyword. While
a query point may have multiple textwise matching points,
recalling the related work on spatial-only trajectory search,
similarity is computed from one point to another point. There-
fore, we only choose the point with the maximum spatial-
textual similarity, and add all point-to-point similarities to get
the spatial-textual similarity between query and trajectories.
Definition 4: (Point-to-Point Similarity) We define the
similarity between two points p
i
, p
j
as:
ˆ
S (p
i
, p
j
) =
0, p
i
.act ∩ p
j
.act=∅
α ·
ˆ
S
S
+ (1 − α) ·
ˆ
S
T
, otherwise
(1)
where
ˆ
S
T
(p
i
, p
j
) is the text similarity,
ˆ
S
S
(p
i
, p
j
) is the spatial
similarity between two points, and α ∈ (0, 1) is used to adjust
the relative importance of the spatial and textual similarity.
We use the sum of the textual relevance of each term [1, 18]
to measure the textual similarity, and the Euclidean distance to
measure the spatial similarity. The choice of similarity metric
is orthogonal to our query processing method (in Sec. IV).
ˆ
S
T
(p
i
, p
j
) =
X
t∈p
i
.act∩p
j
.act
γ(t) (2)
ˆ
S
S
(p
i
, p
j
) =
D
max
− Euclidean (p
i
, p
j
)
D
max
(3)
Here, γ(t) is the weight of keyword t in p
j
calculated by a
simple TF·IDF model [1]. The variable D
max
is the maximum
distance between any two unique points in geographical space,
and used to normalize the spatial scoring between 0 and 1.
Definition 5: (Point-to-Trajectory Similarity) The simi-
larity between a query point q
i
and a trajectory T is defined
as:
TABLE III. SIMILARITY TABLE BETWEEN Q AND ALL TRAJECTORIES
T
1
TO T
6
SHOWN IN FIGURE 2 BASED ON DEFINITIONS 1-6 WHERE
α = 0.5. HERE, “ID” SHOWS THE POINT POSITION IN TRAJECTORY.
q
1
q
2
q
3
Q
ID
ˆ
S
S
ˆ
S
T
ID
ˆ
S
S
ˆ
S
T
ID
ˆ
S
S
ˆ
S
T
ˆ
S
T
1
2 0.7 0.7 3 0.4 0.5 4 0.3 0.5 0.516
T
2
2 0.6 0.5 4 0.5 0.5 0.35
T
3
1 0.9 0.5 0.233
T
4
1 0.2 0.3 2 0.7 0.5 0.283
T
5
2 0.7 0.3 3 0.5 0.3 0.3
T
6
3 0.4 0.2 4 0.2 0.2 0.166
ˆ
S (q
i
, T ) = max
p
j
∈T
n
ˆ
S (q
i
, p
j
)
o
(4)
Definition 6: (Pointwise Similarity) The pointwise sim-
ilarity between T and Q is a sum of the point-trajectory
similarities between T and each point in Q, normalized by
|Q|:
ˆ
S (Q, T ) =
X
q
i
∈Q
ˆ
S (q
i
, T ) /|Q|. (5)
In trajectory T , |Q| points are chosen to compute the final
similarity between T and Q. These |Q| points form a sub-
trajectory which can be taken as a representative result, and
denoted as T
Q
.
Definition 7: (Top-k Exemplar Trajectory Query) Given
a trajectory database D = {T
1
, . . . , T
|D|
} and query Q, a
trajectory search retrieves a set R ⊆ D with k trajectories
such that: ∀r ∈ R, ∀r
0
∈ D − R,
ˆ
S(Q, r) >
ˆ
S(Q, r
0
).
Example 2: Figure 2 is an illustrative example of a query
and trajectories, showing: (a) The spatial shapes of the query
and trajectories; (b) The keywords attached to each point in T
1
;
and (c) The best match for each query point with T
1
based on
our pointwise similarity model. Further, Table III presents an
example of the similarity computations between a query Q and
the six trajectories (shown in Figure 2). For each query point,
we list the maximum similarity for every trajectory, and a blank
space means that they share no common keywords. We can
compute the similarity between Q and T
1
using
ˆ
S (Q, T
1
) =
0.7+0.7
2
+
0.4+0.5
2
+
0.3+0.5
2
3
= 0.516 and the similarities of other
trajectories are listed in the right column of the table. As we
can see, T
1
, T
2
, T
5
are the top-3 results.
IV. INCREMENTAL QUERY PROCESSING
The similarity (Definition 6) is an aggregation of spatial-
textual similarities from all query points, and is inspired by
spatial-only trajectory search [3, 11, 12, 15]. The threshold
algorithm of Fagin et al. [6] can be used directly as a filtering
framework for ranked lists. While in principle a similar idea
can be modified to suit our purposes, using the algorithm of
Fagin et al. directly does not work since the top-k list for every
point in the query is not known a priori. However another
solution, the incremental k nearest neighbor search algorithm
IKNN [3, 11, 12] can be used to fill partially ranked lists with
exactly λ nearest points for every query point. The ranked lists
can be expanded by increasing λ until all unseen trajectories
can not beat the current results. In this section, we show how
to extend IKNN from spatial-only to spatial-textual search, and
propose several bounds to terminate the expansion, which form
a baseline processing framework for ETQ.
A. Incremental Lookup Algorithm
The Incremental Lookup Algorithm (ILA) can be divided
into three steps, as shown in Algorithm 1.