高效解答Top-k示例轨迹查询

172 浏览量更新于2024-08-29 收藏 902KB PDF 举报

"Answering Top-k Exemplar Trajectory Query" 这篇研究论文主要探讨了一种新的空间文本轨迹搜索类型——Exemplar Trajectory Query (ETQ)。ETQ允许用户指定一系列要访问的地点，并对每个地点的活动提供描述。其目标是高效地找出在所有点上计算空间和文本相似度后排名前k的轨迹。与传统方法相比，这种点对点的匹配计算成本显著增加。为了应对这一挑战，作者提出了一种增量剪枝基础方法，并探索了如何自适应地调整其策略。他们引入了基于差距的优化和新颖的两级阈值算法，以提高效率。这些方法支持对顺序敏感的ETQ，并且通过适度的扩展可以实现这一目标。实验部分在两个数据集上验证了所提解决方案的效率和可扩展性。这表明，尽管ETQ的计算复杂性较高，但通过创新的算法设计，能够有效地处理大规模轨迹数据。这项工作对于地理位置服务、旅行推荐系统以及任何需要基于地点活动信息的轨迹搜索应用都具有重要意义。论文中的关键知识点包括： 1. **Exemplar Trajectory Query (ETQ)**：一种新型的空间文本轨迹查询，结合了空间位置（地点）和文本描述（活动）。 2. **点对点匹配**：ETQ中每个地点的活动描述与轨迹中的对应点进行空间和文本相似度计算，这是计算成本高的主要原因。 3. **增量剪枝基础方法**：一种用于降低计算负担的策略，通过逐步排除低可能性的轨迹来加速查询过程。 4. **基于差距的优化**：通过设定一定的差距阈值，提前停止不满足条件的轨迹匹配，以提升效率。 5. **两级阈值算法**：结合一级和二级阈值，更精确地控制查询过程，进一步优化性能。 6. **顺序敏感的ETQ**：考虑轨迹中地点访问顺序的查询，这在实际应用中可能更为重要。 7. **实验验证**：在两个数据集上的实验结果证明了所提出的算法在效率和可扩展性方面的优势。 8. **应用场景**：适用于需要处理大量轨迹数据的领域，如智能交通、旅游规划和个性化推荐系统等。

TABLE II. SUMMARY OF NOTATION

Notation Deﬁnition

T Trajectory

Q Exemplar query

S (Q, T ) Trajectory similarity between Q and T (Deﬁnition 6)

, p

Points p

and p

, p

) Textual similarity between p

and p

, p

) Spatial similarity between p

and p

S(p

, p

) Spatial-textual similarity between p

and p

unseen UB The upper bound across all unseen trajectories’ similarity

seen

(Q, T )

A function returning the lower bound

of the trajectory similarity between Q and T

seen

(Q, T )

A function returning the upper bound

of trajectory similarity between Q and T

BG (c)

The gap between the lower and upper

bound in the c-th round of expansion (Deﬁnition 8)

Den (q

)

The similarity density of the query

point q

in the latest round of expansion

)

The minimum similarity of candidate points

for q

in the c-th round of expansion

) The ranked list for q

in the c-th round of expansion

tra

The set of all checked trajectories

max

The number of iterations to scan all points in candidate points

Spa

) The similarity sparsity of q

in c-th expansion

Deﬁnition 1: (Point) A point p = (loc, act) is a pair

consisting of a location loc and a set of associated keywords

act = (t

, t

, . . . , t

) describing the loc and/or the activities at

loc.

Deﬁnition 2: (Trajectory) A trajectory T of length n is in

the form of p

, p

, . . . , p

, where each p

is a point.

Deﬁnition 3: (Query) A query Q (of size m) is a set of

points in the form of {q

,. . . ,q

The similarity between T and Q is computed between

points which share at least one common keyword. While

a query point may have multiple textwise matching points,

recalling the related work on spatial-only trajectory search,

similarity is computed from one point to another point. There-

fore, we only choose the point with the maximum spatial-

textual similarity, and add all point-to-point similarities to get

the spatial-textual similarity between query and trajectories.

Deﬁnition 4: (Point-to-Point Similarity) We deﬁne the

similarity between two points p

, p

as:

S (p

, p

) =



0, p

.act ∩ p

.act=∅

α ·

+ (1 − α) ·

, otherwise

(1)

where

, p

) is the text similarity,

, p

) is the spatial

similarity between two points, and α ∈ (0, 1) is used to adjust

the relative importance of the spatial and textual similarity.

We use the sum of the textual relevance of each term [1, 18]

to measure the textual similarity, and the Euclidean distance to

measure the spatial similarity. The choice of similarity metric

is orthogonal to our query processing method (in Sec. IV).

, p

) =

t∈p

.act∩p

.act

γ(t) (2)

, p

) =

max

− Euclidean (p

, p

)

max

(3)

Here, γ(t) is the weight of keyword t in p

calculated by a

simple TF·IDF model [1]. The variable D

max

is the maximum

distance between any two unique points in geographical space,

and used to normalize the spatial scoring between 0 and 1.

Deﬁnition 5: (Point-to-Trajectory Similarity) The simi-

larity between a query point q

and a trajectory T is deﬁned

as:

TABLE III. SIMILARITY TABLE BETWEEN Q AND ALL TRAJECTORIES

TO T

SHOWN IN FIGURE 2 BASED ON DEFINITIONS 1-6 WHERE

α = 0.5. HERE, “ID” SHOWS THE POINT POSITION IN TRAJECTORY.

2 0.7 0.7 3 0.4 0.5 4 0.3 0.5 0.516

2 0.6 0.5 4 0.5 0.5 0.35

1 0.9 0.5 0.233

1 0.2 0.3 2 0.7 0.5 0.283

2 0.7 0.3 3 0.5 0.3 0.3

3 0.4 0.2 4 0.2 0.2 0.166

S (q

, T ) = max

∈T

S (q

, p

)

(4)

Deﬁnition 6: (Pointwise Similarity) The pointwise sim-

ilarity between T and Q is a sum of the point-trajectory

similarities between T and each point in Q, normalized by

|Q|:

S (Q, T ) =

∈Q

S (q

, T ) /|Q|. (5)

In trajectory T , |Q| points are chosen to compute the ﬁnal

similarity between T and Q. These |Q| points form a sub-

trajectory which can be taken as a representative result, and

denoted as T

Deﬁnition 7: (Top-k Exemplar Trajectory Query) Given

a trajectory database D = {T

, . . . , T

|D|

} and query Q, a

trajectory search retrieves a set R ⊆ D with k trajectories

such that: ∀r ∈ R, ∀r

∈ D − R,

S(Q, r) >

S(Q, r

Example 2: Figure 2 is an illustrative example of a query

and trajectories, showing: (a) The spatial shapes of the query

and trajectories; (b) The keywords attached to each point in T

;

and (c) The best match for each query point with T

based on

our pointwise similarity model. Further, Table III presents an

example of the similarity computations between a query Q and

the six trajectories (shown in Figure 2). For each query point,

we list the maximum similarity for every trajectory, and a blank

space means that they share no common keywords. We can

compute the similarity between Q and T

using

S (Q, T

) =

0.7+0.7

0.4+0.5

0.3+0.5

= 0.516 and the similarities of other

trajectories are listed in the right column of the table. As we

can see, T

, T

are the top-3 results.

IV. INCREMENTAL QUERY PROCESSING

The similarity (Deﬁnition 6) is an aggregation of spatial-

textual similarities from all query points, and is inspired by

spatial-only trajectory search [3, 11, 12, 15]. The threshold

algorithm of Fagin et al. [6] can be used directly as a ﬁltering

framework for ranked lists. While in principle a similar idea

can be modiﬁed to suit our purposes, using the algorithm of

Fagin et al. directly does not work since the top-k list for every

point in the query is not known a priori. However another

solution, the incremental k nearest neighbor search algorithm

IKNN [3, 11, 12] can be used to ﬁll partially ranked lists with

exactly λ nearest points for every query point. The ranked lists

can be expanded by increasing λ until all unseen trajectories

can not beat the current results. In this section, we show how

to extend IKNN from spatial-only to spatial-textual search, and

propose several bounds to terminate the expansion, which form

a baseline processing framework for ETQ.

A. Incremental Lookup Algorithm

The Incremental Lookup Algorithm (ILA) can be divided

into three steps, as shown in Algorithm 1.

剩余11页未读，继续阅读

weixin_38689976

粉丝: 6
资源: 924

高效解答Top-k示例轨迹查询

"最全中文API：ArcGis-for-JavaScript示例讲解及应用指南

BPMN-js实战示例：深入理解流程图与应用集成

MATLAB最小捕捉轨迹规划示例代码解析

实现Top-K PLL算法：快速处理网络top-k距离查询

c# usb-hid通信上位机示例

Android - TabHost选项卡示例源码

Android Wi-Fi Direct 开发示例代码

jquery-easyui后台成型模板示例

json-lib-2.4-jdk15包含依赖JAR包和简单示例

S7-200斜坡函数库-RAMP示例程序深度解析

最新资源