Truncated Trace Classifier: Identification and Impact Analysis

Updated 2024-06-26. 639 KB, DOCX.

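Processing and analyzing data streams has become critical in today's business environment. Event logs, which record the activities of business processes, are a core input for process mining. However, because data arrives in real time and log extraction has practical limits, logs often contain truncated traces, that is, incomplete event sequences. Such traces can seriously distort downstream process mining results because they give only a partial view of the business process.

The paper "Truncated Trace Classifier. Removal of Incomplete Traces from Event Logs" addresses this problem with an algorithm called the Truncated Trace Classifier (TTC). A TTC distinguishes truncated traces from traces that are not truncated. Identifying and removing the truncated ones keeps them from degrading process model discovery.

The authors implemented five TTC variants, based on LSTM (long short-term memory networks) and XGBoost (a gradient-boosted decision tree model), and benchmarked them on 13 real-life event logs. The results show that an accurate TTC improves the precision of process discovery by 9.1% on average, which indicates that the process model is recovered more accurately once truncated traces are filtered out. The paper also examines the effect of a TTC on next-event prediction, where it improved the performance of the prediction algorithm by up to 7.5%.

The contribution of the TTC is a method for handling truncated traces in event logs, a problem that previous work had left open. Future research may extend it to more complex event log structures, optimize its performance, and explore its use in business processes of different domains and scales. The work is relevant both to process mining and to organizations that rely on accurate event data for decisions.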


A truncated trace is an ongoing trace in which the end of the process is missing. Truncated traces are sometimes called ‘incomplete cases’ [7,8], ‘incomplete traces’ [5], or ‘missing heads’ [10]. We favor the term truncated over the term incomplete as the latter is often used for the concept of ‘event log incompleteness’, referring to the fact that an event log will most likely not contain all the combinations of behaviors that are possible because there are too many of them [12]. For instance, when there is a loop in the process model, the number of unique combinations is infinite. An event log will therefore most likely be incomplete even when it contains no truncated traces.
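The difference between an incomplete log and a truncated trace can be illustrated with a small sketch. The toy process below (A, then any number of B, then C) and the helper names `loop_variants` and `is_truncated` are assumptions made for illustration; they do not come from the paper.

```python
from itertools import islice


def loop_variants():
    """Yield the distinct complete traces of a toy process:
    A, then zero or more repetitions of B, then C."""
    repetitions = 0
    while True:
        yield ("A",) + ("B",) * repetitions + ("C",)
        repetitions += 1


def is_truncated(trace):
    """In this toy process, a trace is complete only if it ends with C."""
    return not trace or trace[-1] != "C"


# A finite log can record only finitely many of the infinitely many variants.
observed_log = list(islice(loop_variants(), 3))

# A variant the toy process allows but the log never recorded.
unseen_variant = ("A", "B", "B", "B", "C")

log_is_incomplete = unseen_variant not in observed_log
log_has_truncated_traces = any(is_truncated(t) for t in observed_log)

print(observed_log)
print(log_is_incomplete, log_has_truncated_traces)
```

Here the log is incomplete (a permitted variant is absent) although every recorded trace reaches the final activity, so none is truncated.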

There are several reasons to explain the existence of truncated traces. They might exist because of a flawed event log extraction process that cuts the traces at a fixed date, leaving the traces that finish after that date truncated. This issue, named ‘the snapshots challenge’, has been identified by van der Aalst as one of the five challenges that occur when extracting event logs [6, chapter 5.3]. This type of truncated trace could be avoided by extracting only the traces where no event happens after the extraction date. However, once the data is extracted, we cannot know which traces are truncated. As another example, truncated traces can exist because the events have not happened yet. This is especially relevant when working with streaming data. Finally, truncated traces can result from a wrong execution (e.g., the ticket was supposed to be closed but the agent forgot to do it) or from a failure of the information system. In the next section, we introduce a classifier to automatically detect truncated traces.
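The snapshot problem described above can be sketched as follows. The toy log, the case identifiers, and the functions `extract_snapshot`, `truncated_cases`, and `extract_complete_only` are hypothetical and only illustrate the idea; note that identifying the truncated cases requires access to the full source data, which is exactly what is no longer available after extraction.

```python
from datetime import datetime

# Hypothetical source data: case id -> list of (activity, timestamp).
full_log = {
    "case-1": [("open", datetime(2024, 1, 1)),
               ("work", datetime(2024, 1, 3)),
               ("close", datetime(2024, 1, 5))],
    "case-2": [("open", datetime(2024, 1, 4)),
               ("work", datetime(2024, 1, 9)),
               ("close", datetime(2024, 1, 12))],
}


def extract_snapshot(log, cutoff):
    """Naive extraction: keep every event at or before the cutoff date."""
    snapshot = {}
    for case, events in log.items():
        kept = [event for event in events if event[1] <= cutoff]
        if kept:
            snapshot[case] = kept
    return snapshot


def truncated_cases(log, cutoff):
    """Cases that the cutoff splits: some events before it, some after."""
    return {
        case for case, events in log.items()
        if any(ts <= cutoff for _, ts in events)
        and any(ts > cutoff for _, ts in events)
    }


def extract_complete_only(log, cutoff):
    """Extraction that keeps only cases with no event after the cutoff."""
    return {
        case: events for case, events in log.items()
        if all(ts <= cutoff for _, ts in events)
    }


cutoff = datetime(2024, 1, 6)
snapshot = extract_snapshot(full_log, cutoff)
print(snapshot["case-2"])                  # only the "open" event survives
print(truncated_cases(full_log, cutoff))   # needs the full source data
print(list(extract_complete_only(full_log, cutoff)))
```

The stricter extraction only helps with the snapshot cause. It cannot detect traces that are truncated because later events have not happened yet or were never recorded.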

3 Truncated Trace Classifier


A TTC takes the current execution of a trace as input and predicts whether it is truncated. As shown in Table 1, we generate one input sample and one target for each prefix length of each trace. The input sample represents the current state of the process on which we apply a TTC. The target is a binary label that is ‘true’ when the trace is truncated and ‘false’ otherwise.
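The sample generation described above can be sketched as follows. This is a minimal illustration that assumes every trace in the training log is complete, so each strict prefix is labeled truncated and only the full trace is labeled not truncated. Table 1 is not reproduced in this excerpt, so that labeling rule, the function name `prefix_samples`, and the raw-list encoding of the input are assumptions, not the paper's exact feature representation.

```python
def prefix_samples(trace):
    """Return one (prefix, is_truncated) pair per prefix length.

    Assumes `trace` is a complete trace: every strict prefix is
    labeled True (truncated) and the full trace is labeled False.
    """
    samples = []
    for length in range(1, len(trace) + 1):
        prefix = trace[:length]
        is_truncated = length < len(trace)
        samples.append((prefix, is_truncated))
    return samples


def build_dataset(log):
    """Flatten the per-trace samples of a whole log into one list."""
    dataset = []
    for trace in log:
        dataset.extend(prefix_samples(trace))
    return dataset


# Hypothetical two-trace log of activity names.
toy_log = [["a", "b", "c"], ["a", "c"]]
dataset = build_dataset(toy_log)
for prefix, label in dataset:
    print(prefix, label)
```

A trace of length n contributes n samples, n - 1 of them positive, so the resulting dataset is typically skewed toward the truncated class. Any classifier trained on it would need to take that imbalance into account.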
