A New Method for Frequent Trajectory Pattern Mining Based on Vague Space Partition

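Frequent trajectory pattern mining is an important spatiotemporal data analysis task with wide application in traffic management, urban planning, and business analytics. Because spatial trajectory data are vague and uncertain, the task is challenging. Conventional methods rely on a crisp space partition: locations are discretized into cells, and classic sequential pattern mining algorithms such as PrefixSpan or GSP (Generalized Sequential Patterns) are applied to the result, which eases the spatial approximation problem. These methods share an inherent flaw, the "sharp boundary" problem: trajectory locations that are spatially adjacent may be assigned to different cells. Two trajectories that are geographically very close can therefore be treated as different patterns purely because of where the cell boundaries fall, which contradicts the expectation that similar trajectories should count as the same pattern.

To address this, the paper proposes a frequent trajectory pattern mining method based on a vague space partition. A vague partition handles the uncertainty of trajectory locations more naturally: by introducing degrees of vagueness and a weighting mechanism, it makes neighboring trajectories more likely to be grouped into the same pattern. The aim is to reduce misassignments caused by the partition and to improve the accuracy of pattern discovery.

Liang Wang et al. design a method that combines characteristics of the prefix-projected pattern-growth algorithm (PrefixSpan) or the GSP algorithm with a treatment of location vagueness. They first apply a vague partition to the trajectory dataset and then use an adaptive threshold strategy to determine the similarity between trajectories, from which frequent patterns are identified. The method reduces the influence of spatial boundaries while remaining efficient, and so yields frequent trajectory patterns that better match practical needs.

In short, the paper examines how a vague space partition can improve frequent trajectory pattern mining by overcoming the boundary problem of crisp partitions. Future work could explore how to optimize the vague partition strategy for larger and more complex trajectory datasets, and how to combine the method with other advanced data analysis techniques.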

By utilizing the gridding spatial partition approach, the continuous spatial domain can be simply represented by the discrete grid cells. Accordingly, the problem of trajectory pattern mining can be reduced to a sequence pattern mining problem, so that many conventional sequence pattern mining techniques can be applied to this problem.
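
As a minimal sketch of this reduction, assume a uniform square grid with a hypothetical cell width `cell_size` and integer `(column, row)` cell labels; neither is specified in the text. A trajectory of coordinates then becomes a sequence of cell labels that a sequence pattern miner could consume:

```python
import math

def to_cell(x, y, cell_size):
    """Map a coordinate to the (column, row) label of the grid cell containing it."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))

def to_cell_sequence(trajectory, cell_size):
    """Convert a list of (x, y) locations into a sequence of grid cell labels."""
    return [to_cell(x, y, cell_size) for x, y in trajectory]

trajectory = [(0.2, 0.3), (1.4, 0.6), (2.7, 1.1)]
sequence = to_cell_sequence(trajectory, cell_size=1.0)
print(sequence)  # [(0, 0), (1, 0), (2, 1)]
```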

Although the grid-based approach can solve the spatial approximation concern to a certain degree as mentioned above, it causes the sharp boundary problem. That is, when some approximate trajectory locations are close to the boundary of predetermined grid cells, they will most likely be assigned to different cells by the strict boundary constraints. Therefore, some potentially meaningful patterns will not be discovered. In other words, the approximate trajectory locations may span more than one grid cell under the crisp gridding spatial partition approach. Consider the scenario shown in Fig. 1: two similar and close trajectory instances, $Tr_1$ and $Tr_2$, each consisting of two locations, are regarded as two different moving behaviors (two different cell-to-cell transitions) under the crisp space partition. However, it is obvious that these two trajectories share the same moving pattern. This example clearly suggests that the crisp grid space partition method is sensitive to spatial noise. If mobile objects with similar trajectory patterns do not follow exactly the same trajectories, the implicit pattern cannot be detected.
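
The effect can be reproduced with a few lines of code. The sketch below is illustrative only: the coordinates are made up, and a unit square grid is assumed. Two trajectories whose first locations lie 0.04 units apart, on either side of a cell boundary, produce different cell sequences:

```python
import math

def crisp_cell(x, y, cell_size=1.0):
    """Crisp assignment: each location belongs to exactly one cell."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))

# Hypothetical data: the first locations straddle the boundary x = 1.0.
tr_a = [(0.98, 0.50), (1.50, 0.50)]
tr_b = [(1.02, 0.50), (1.50, 0.50)]

seq_a = [crisp_cell(x, y) for x, y in tr_a]
seq_b = [crisp_cell(x, y) for x, y in tr_b]

print(seq_a)           # [(0, 0), (1, 0)]
print(seq_b)           # [(1, 0), (1, 0)]
print(seq_a == seq_b)  # False
```

A crisp miner would count these as two distinct behaviors even though the trajectories nearly coincide.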

However, in practice, we do not expect a mobile object to visit exactly the same location at every time instant during each time period. In addition, due to the limited energy supply of sensor devices and location precision errors, the trajectory readings from these devices always carry uncertain information. In that case, the sharp boundary problem worsens drastically with increasing noise. The regular crisp space partition method can be seen as a hard partition with strict boundary constraints. From Fig. 1, we can learn that a hard partition is very likely to cause the sharp boundary problem. Therefore, we consider dividing the spatial plane with a flexible partition to approximate close trajectory locations and tackle the sharp boundary problem.

3.2. Problem definition

Definition 1. The time interval in a moving trajectory between any two trajectory locations $\langle x_i, y_i, t_i \rangle$ and $\langle x_j, y_j, t_j \rangle$ is defined as $\Delta t = t_j - t_i$, where $t_j > t_i$. For example, the consecutive time intervals of a moving trajectory $Tr = \{\langle p_1, 2 \rangle, \langle p_2, 5 \rangle, \langle p_3, 10 \rangle\}$ are 3 and 5 respectively, and the transformed form of the moving trajectory with time intervals is represented as $p_1 \xrightarrow{3} p_2 \xrightarrow{5} p_3$.
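
The example in Definition 1 can be checked with a short sketch. The function name `to_interval_form` and the string labels `"p1"`, `"p2"`, `"p3"` are placeholders chosen here for illustration:

```python
def to_interval_form(trajectory):
    """Split a time-ordered list of (location, timestamp) pairs into
    the list of locations and the list of consecutive time intervals."""
    locations = [loc for loc, _ in trajectory]
    times = [t for _, t in trajectory]
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return locations, intervals

locations, intervals = to_interval_form([("p1", 2), ("p2", 5), ("p3", 10)])
print(locations)  # ['p1', 'p2', 'p3']
print(intervals)  # [3, 5]
```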

Definition 2. A trajectory pattern in this work is defined as $r_i \xrightarrow{\Delta t} r_j$, where $r_i$ and $r_j$ denote the spatial grid cell labels, and $\Delta t$ is a time interval between these two cells. A pattern of length $k$ is called a $k$-pattern. A $k$-pattern comprises $k$ grid cells and $(k - 1)$ time intervals.

Definition 3. The space membership value is defined as the membership degree of the trajectory location $\langle x_i, y_i \rangle$ to the spatial grid cell $r_j$.

Actually, the space membership value is inversely proportional to the relative distance between $\langle x_i, y_i \rangle$ and $r_j$. By means of a membership function, the trajectory location $\langle x_i, y_i \rangle$ can be transformed into the grid-labeled data $\langle r_j, m_j \rangle$, where $r_j$ is the label of the assigned grid cell and $m_j$ denotes the corresponding membership value. Note that the trajectory location $\langle x_i, y_i \rangle$ may be mapped into more than one neighboring grid cell with membership values.
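
The excerpt does not give the membership function itself, so the following is only one plausible sketch. It assumes that distance is measured to each candidate cell's center and that the inverse distances are normalized to sum to 1. Both choices, along with the labels `"r1"` and `"r2"`, are assumptions made here for illustration:

```python
import math

def memberships(x, y, cell_centers, eps=1e-9):
    """Return {cell_label: membership} from normalized inverse distance
    to each candidate cell center (an assumed membership function)."""
    weights = {
        label: 1.0 / (math.hypot(x - cx, y - cy) + eps)  # eps avoids division by zero
        for label, (cx, cy) in cell_centers.items()
    }
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# A location 0.4 from the center of r1 and 0.6 from the center of r2.
centers = {"r1": (0.5, 0.5), "r2": (1.5, 0.5)}
m = memberships(0.9, 0.5, centers)
print({label: round(value, 3) for label, value in m.items()})
```

With these inputs the location receives a membership of about 0.6 in `r1` and 0.4 in `r2`, so it is mapped into both neighboring cells rather than only one.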

Definition 4. The number of partitioned spatial grid cells $e$ is defined as the desired space granularity to view the spatial data. The larger the value $e$ is, the finer the granularity of space is. On the contrary, the smaller the number of the partitioned grid cells is, the coarser the space granularity is.

However, selecting the value of $e$ in data analysis is not trivial. Too coarse a granularity (small $e$) may damage the pattern semantics, while too fine a granularity (large $e$) may result in a small number of patterns (or none in the worst case) and high computation cost. In practice, $e$ can be determined based on domain knowledge.

Definition 5. In the vague space partition, the crisp zone radius $r$ is defined as the ratio of the crisp zone area to the unit grid cell area.

Unlike the regular grid cell, the vague grid cell comprises a crisp zone and an intermediate zone. A trajectory location in the crisp zone is assigned exclusively to the grid cell containing it, with a constant membership value of 1. A location in the intermediate zone, by contrast, may be assigned to more than one neighboring cell.
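
A rough sketch of the crisp-zone test follows. It assumes square cells of side `cell_size` and a circular crisp zone centered in the cell (Section 4.1 describes the crisp zone as the part inside a circle). The parameter `ratio` plays the role of the area ratio in Definition 5. The conversion from ratio to circle radius, and the requirement that the circle fit inside the cell, are assumptions of this sketch rather than statements from the paper:

```python
import math

def crisp_radius(cell_size, ratio):
    """Radius of a circular crisp zone whose area is `ratio` times the cell area."""
    if not 0.0 < ratio <= math.pi / 4:
        raise ValueError("ratio must be in (0, pi/4] for the circle to fit in the cell")
    return cell_size * math.sqrt(ratio / math.pi)

def in_crisp_zone(x, y, cell_size, ratio):
    """True if (x, y) lies in the crisp zone of the cell that contains it."""
    col, row = math.floor(x / cell_size), math.floor(y / cell_size)
    cx, cy = (col + 0.5) * cell_size, (row + 0.5) * cell_size
    return math.hypot(x - cx, y - cy) <= crisp_radius(cell_size, ratio)

print(in_crisp_zone(0.50, 0.50, cell_size=1.0, ratio=0.5))  # True: at the cell center
print(in_crisp_zone(0.98, 0.50, cell_size=1.0, ratio=0.5))  # False: near the boundary
```

A location for which this returns `True` would get membership 1 in its own cell. A location in the intermediate zone would instead be shared among neighboring cells by a membership function.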

Definition 6. A frequent trajectory pattern is a pattern whose total support in the database is not less than the user-specified minimum support threshold minsup.

Given a trajectory database $D$, a vague space partition membership function and a minimum support threshold minsup, the objective of mining frequent trajectory patterns is to find the complete set of frequent patterns, i.e., all trajectory patterns $r_i \xrightarrow{\Delta t} r_j$ whose total support is not less than minsup.
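
To make the objective concrete, here is a hedged sketch that counts weighted support for 2-patterns only. It is not the paper's mining algorithm. The data layout is an assumption, as are three rules: `min` combines two memberships, each trajectory contributes its best occurrence, and time intervals must match exactly:

```python
from collections import defaultdict

def frequent_2_patterns(database, minsup):
    """database: list of trajectories; each trajectory is a time-ordered list of
    (timestamp, {cell_label: membership}) steps.
    Returns {(cell_i, interval, cell_j): total_support} for supports >= minsup."""
    support = defaultdict(float)
    for trajectory in database:
        best = {}  # pattern -> largest weight seen in this trajectory
        for (t1, cells1), (t2, cells2) in zip(trajectory, trajectory[1:]):
            for c1, m1 in cells1.items():
                for c2, m2 in cells2.items():
                    pattern = (c1, t2 - t1, c2)
                    weight = min(m1, m2)  # assumed combination rule
                    best[pattern] = max(best.get(pattern, 0.0), weight)
        for pattern, weight in best.items():
            support[pattern] += weight
    return {p: s for p, s in support.items() if s >= minsup}

database = [
    [(0, {"r1": 1.0}), (3, {"r2": 1.0})],
    [(0, {"r1": 0.5, "r3": 0.5}), (3, {"r2": 1.0})],
]
result = frequent_2_patterns(database, minsup=1.5)
print(result)  # {('r1', 3, 'r2'): 1.5}
```

The second trajectory starts in an intermediate zone shared by `r1` and `r3`, so it adds 0.5 to each candidate pattern. Only the `r1` pattern reaches the threshold.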

4. Proposed method

4.1. Vague space partition method

In this subsection we discuss how to partition the spatial plane flexibly in a vague way, and how to convert the original trajectory dataset into transformed trajectory sequences.

Inspired by rough set theory, we consider dividing the regular grid cell into a crisp zone and an intermediate zone. In this way, we can build a novel grid cell, namely the vague grid cell. Fig. 2 depicts three vague grid cells, $s_1$, $s_2$ and $s_3$, which are distributed uniformly over the spatial plane. The part inside the circle of a vague grid cell is defined as the crisp zone, and the one outside the crisp

Fig. 1. Example of the sharp boundary problem.

102 L. Wang et al. / Knowledge-Based Systems 50 (2013) 100–111
