![](https://csdnimg.cn/release/download_crawler_static/16626736/bg3.jpg)
pattern, the time interval between two continuous items in a
pattern is also considered. It usually indicates minimum and
maximum time intervals between two adjacent items. We
may extend such methods to our problem by setting multiple
gap-values to discover all fixed-gap patterns in the sequence.
But it is obvious that such an approach is time consuming due
to the exponential search space. Though [18] is able to dis-
cover rigid wild-card patterns with fixed gap constrains in
biological sequences, it can only generate long patterns by
convoluting known elementary patterns. Our algorithms,
however, can discover all fixed-gap episodes and valid pre-
cise-positioning episode rules without any prior knowledge.
Frequent Episode Mining. Mining frequent episodes from
event sequence was first introduced by Mannila et al. [1]
where episodes are defined as directed acyclic graphs and
two kinds of counting support are considered, i.e., sliding
windows and minimal occurrence. After that, various fre-
quency measures are defined to discover different kinds of
episodes according to different applications, and minimal
occurrence is one of widely used measures [4], [7], [14], [20],
[21]. Mining general episodes can be intricate and computing-
intensive, for example, discovering whether a sequence cov-
ers a general episode is NP-hard [22]. Existing algorithms can
be categorized into two types, namely breadth-first enumera-
tion [1], [20], [23] and depth-first enumeration methods [4],
[7], [24]. Among them, the depth-first enumeration methods
can be used to discover episode minimal occurrence. How-
ever, most of these algorithms require a post-processing step
to verify detected occurrences [9], which still have a signifi-
cant space for improvement. On the other hand, researches
have been focused on mining subclasses of episodes, for
example, serial episodes [25], closed episodes [6], [22], maxi-
mal episodes [21], episodes with unique labels [26], [27].
Episode Rule Mining. Inchoate episode rules are considered
a “second-stage” output derived from frequent episodes [1],
[7]. Episode rules are usually represented in the form of a
time range in which the consequent will happen. Meger
et al. [13], [28] constructed episode rules with gap constraint
episodes and proposed the algorithm to find the optimal win-
dow size. Fournier-Viger et al. [29] mined partially-ordered
sequential rules in which items are unordered in both the
antecedent and the consequent. Such kind of rules may
improve prediction accuracy in some applications. Lin
et al. [30] focused on the utility of episode rules and proposed
an algorithm to directly mine high utility episode rules. Our
work focuses on mining precise-positioning episode rules
motivated by critical applications in which we need to trigger
possible right responses at a more fine-grained right time.
3PRELIMINARIES AND PROBLEM STATEMENT
In this section, we first give some preliminary definitions
in frequent episode mining (Definitions 1, 2, 3, 4, and 5) [1],
[7], [9], [20]. Then, we propose some new concepts about
precise-positioning episode rules (Definitions 6, 7, 8, 9, 10,
11, and 12), and finally formulate the mining problem.
3.1 Preliminaries
Definition 1 (Event Sequence).
Let E be a finite set of events.
An event sequence, denoted
~
S ¼hðE
1
;t
1
Þ; ðE
2
;t
2
Þ; ...; ðE
n
;t
n
Þi,
is an ordered sequence of events, where each E
i
6¼;and E
i
E
consists of all events associated with timestamp t
i
, and t
j
<t
k
for any 1 j<k n.
For example, Fig. 1 shows an event sequence
~
S ¼ hðfDg; 1Þ; ðfA; Dg; 3Þ; ðfA; Bg; 4Þ; ðfEg; 5Þ; ðfB; D; Eg; 6Þ; ðfAg; 7Þ;
ðfBg; 8Þ; ðfE; Fg; 9Þ; ðfCg; 10Þ; ðfA; Fg; 11Þ; ðfFg; 12Þi
.
Definition 2 (Episode). An episode a is defined as a non-
empty totally ordered set of events of the form he
a
1
; ...;
e
a
j
; ...;e
a
k
i where e
a
i
2Efor all i 2½1;k and the event e
a
i
occurs before the event e
a
j
for any 1 i<j k. An episode a
of length k is referred to a k-episode.
Definition 3 (Episode Occurrence). Given an episode a ¼
he
a
1
; ...;e
a
j
; ...;e
a
k
i and a sequence
~
S; ½t
a
1
; ...;t
a
i
; ...;t
a
k
is an occurrence of a if and only if (1) e
a
i
is an element
of the event set E
a
i
at time t
a
i
for all i 2½1;k; (2) t
a
1
<t
a
2
< <t
a
k
. The time window ½t
a
1
;t
a
k
is called an occur-
rence window of a. In this study, we only consider the episode
occurrences whose window size is smaller than a user-specified
threshold d, namely t
a
k
t
a
1
< d. The set of all occurrences of
a in the sequence
~
S is denoted by ocSetðaÞ.
For example, if d ¼ 6 in Fig. 1, ocSetðhD; A; BiÞ¼f½1; 3; 4;
½3; 4; 6; ½3; 4; 8; ½3; 7; 8; ½6; 7; 8g.
Definition 4 (Minimal Episode Occurrence (MEO)).
Consider two time windows ½t
i
;t
j
and ½t
0
i
;t
0
j
. ½t
0
i
;t
0
j
is sub-
sumed by ½t
i
;t
j
if t
i
t
0
i
and t
0
j
t
j
. An occurrence window
½t
i
;t
j
of an episode a is a minimal episode occurrence of a if
no other occurrence window ½t
0
i
;t
0
j
of a is subsumed by ½t
i
;t
j
.
For example, moSetðhD; A; BiÞ¼f½1; 4; ½3; 6; ½6; 8g when
d ¼ 6 for the sequence in Fig. 1. The time window ½3; 8 con-
tains occurrences of hD; A; Bi, but it is not a minimal occur-
rence since hD; A; Bi also occurs in ½3; 6.
Definition 5 (Support of Episode). The support of an epi-
sode a, denoted as spðaÞ, is defined as the number of its distinct
MEOs, i.e., spðaÞ¼jmoSetðaÞj. An episode is frequent if and
only if its support is not less than a user-specified parameter
min
sup.
For example, the episode hD; A; Bi is frequent when
min
sup ¼ 3 in Fig. 1.
3.2 Definitions and Problem Statement
Definition 6 (Fixed-Gap Episode).
A fixed-gap episode
b is defined as a tuple in the form ðhe
b
1
; ...;e
b
i
; ...;e
b
k
i;
hDt
1
; ...; Dt
i
; ...; Dt
k1
iÞ where e
b
i
2Efor i 2½1;k and the
event e
b
i
occurs before the event e
b
j
for any 1 i<j k.
Additionally, the time span of the occurring time between event
e
b
jþ1
and event e
b
j
is Dt
j
;j2½1;k 1.
1
We denote a fixed-gap
episode with length k as fixed-gap k-episode.
For example, in Fig. 1, ðhE; Ai; h2iÞ is a fixed-gap
2-episode. The time span between E and A is 2.
Definition 7 (Fixed-Gap Episode Occurrence (FEO)).
Given a fixed-gap episode b ¼ðhe
b
1
; ...;e
b
i
; ...;e
b
k
i; hDt
1
; ...;
Dt
i
; ...; Dt
k1
iÞ; ½t
b
1
; ...;t
b
i
; ...;t
b
k
is an occurrence (FEO)
of b if and only if (1) e
b
i
is an element of event set E
b
i
at time
t
b
i
for all i 2½1;k; (2) t
b
j
<t
b
jþ1
and t
b
jþ1
t
b
j
¼ Dt
j
for all
j 2½1;k 1 . Similar to Definition 3, t
b
1
and t
b
k
constitute an
occurrence window of b, which is denoted as ½t
b
1
;t
b
k
.
1. A single event e is a kind of special fixed-gap episode, and we
denote it as ðhei; nullÞ.
532 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 30, NO. 3, MARCH 2018