Process Mining: Making Knowledge Discovery
Process Centric
Wil van der Aalst
Department of Mathematics and Computer Science
Eindhoven University of Technology
PO Box 513, 5600 MB, Eindhoven, The Netherlands
w.m.p.v.d.aalst@tue.nl
ABSTRACT
Recently, the Task Force on Process Mining released the
Process Mining Manifesto. The manifesto is supported by
53 organizations and 77 process mining experts contributed
to it. The active contributions from end-users, tool vendors,
consultants, analysts, and researchers illustrate the growing
relevance of process mining as a bridge between data mining
and business process modeling. This paper summarizes the
manifesto and explains why process mining is a highly relevant,
but also very challenging, research area. This way we
hope to stimulate the broader ACM SIGKDD community
to look at process-centric knowledge discovery.
1. PROCESS MINING
Process mining is a relatively young research discipline that
sits between computational intelligence and data mining on
the one hand, and process modeling and analysis on the
other hand. The idea of process mining is to discover, mon-
itor and improve real processes (i.e., not assumed processes)
by extracting knowledge from event logs readily available
in today’s (information) systems [1]. Process mining includes
(automated) process discovery (i.e., extracting process
models from an event log), conformance checking (i.e.,
monitoring deviations by comparing model and log), social
network/organizational mining, automated construction of
simulation models, model extension, model repair, case prediction,
and history-based recommendations.
Figure 1 illustrates the scope of process mining. The starting
point for process mining is an event log. All process mining
techniques assume that it is possible to sequentially record
events such that each event refers to an activity (i.e., a well-defined
step in some process) and is related to a particular
case (i.e., a process instance). Event logs may store additional
information about events. In fact, whenever possible,
process mining techniques use extra information such as
the resource (i.e., person or device) executing or initiating
the activity, the timestamp of the event, or data elements
recorded with the event (e.g., the size of an order).
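To make these assumptions concrete, the following minimal Python sketch represents an event log as a list of events and groups them into one trace per case. The class `Event`, the function `group_by_case`, and the sample data are hypothetical illustrations and are not taken from any tool or standard mentioned in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Event:
    case_id: str                      # the process instance the event belongs to
    activity: str                     # the well-defined step that was executed
    timestamp: datetime               # used here to order events within a case
    resource: Optional[str] = None    # person or device, if recorded
    data: dict = field(default_factory=dict)  # e.g. {"order_size": 3}


def group_by_case(events):
    """Return {case_id: [activity, ...]} with events ordered by timestamp."""
    traces = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        traces[event.case_id].append(event.activity)
    return dict(traces)


# Hypothetical sample log (invented for illustration only).
log = [
    Event("order-1", "register", datetime(2011, 1, 3, 9, 0), "Ann", {"order_size": 3}),
    Event("order-2", "register", datetime(2011, 1, 3, 9, 5), "Ann"),
    Event("order-1", "check", datetime(2011, 1, 3, 9, 30), "Bob"),
    Event("order-2", "reject", datetime(2011, 1, 3, 10, 0), "Bob"),
    Event("order-1", "ship", datetime(2011, 1, 4, 8, 0), "Carol"),
]
traces = group_by_case(log)
```

Only the case identifier and the activity name are strictly needed to obtain traces; the resource, timestamp, and data fields are the optional extras described above.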
Event logs can be used to conduct three types of process
mining [1; 2]. The first type of process mining is discovery.
A discovery technique takes an event log and produces
a model without using any a-priori information. Process
discovery is the most prominent process mining technique.
For many organizations it is surprising to see that existing
techniques are indeed able to discover real processes merely
based on example executions in event logs. The second type
of process mining is conformance. Here, an existing process
model is compared with an event log of the same process.
Conformance checking can be used to check if reality, as
recorded in the log, conforms to the model and vice versa.
The third type of process mining is enhancement. Here, the
idea is to extend or improve an existing process model using
information about the actual process recorded in some event
log. Whereas conformance checking measures the alignment
between model and reality, this third type of process mining
aims at changing or extending the a-priori model. For in-
stance, by using timestamps in the event log one can extend
the model to show bottlenecks, service levels, throughput
times, and frequencies.
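As a rough illustration of the first two types, the sketch below "discovers" a very simple model, namely the set of directly-follows steps observed in a log, and then checks a trace against it. This is a deliberately simplified stand-in chosen for brevity, not one of the discovery or conformance techniques referred to in this paper; the function names `discover_dfg` and `deviations` and the sample traces are hypothetical.

```python
from collections import Counter

# Artificial markers so that first and last activities are also captured.
START, END = "<start>", "<end>"


def discover_dfg(traces):
    """Count how often activity a is directly followed by activity b."""
    dfg = Counter()
    for trace in traces:
        path = [START, *trace, END]
        dfg.update(zip(path, path[1:]))
    return dfg


def deviations(trace, dfg):
    """Return the steps of a trace that the discovered graph does not contain."""
    path = [START, *trace, END]
    return [step for step in zip(path, path[1:]) if step not in dfg]


# Hypothetical log: each trace is the activity sequence of one case.
log = [
    ["register", "check", "ship"],
    ["register", "check", "ship"],
    ["register", "reject"],
]
model = discover_dfg(log)                                # discovery
ok = deviations(["register", "check", "ship"], model)   # conforming trace
bad = deviations(["register", "ship"], model)           # skips "check"
```

The counts stored in `model` already hint at the third type: frequencies observed in the log can be attached to the model as an extension.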
Figure 1 shows how first an end-to-end process model is dis-
covered. The model is visualized as a BPMN (Business Pro-
cess Modeling Notation) model, but internally algorithms
are often using more formal notations such as Petri nets,
C-nets, and transition systems [1]. By replaying the event
log on the model it is possible to add information on bottle-
necks, decisions, roles, and resources.
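The idea of adding timing information by replaying a log can be sketched as follows. The example assumes a model that is just a set of directly-follows steps and computes, per step, its frequency and mean waiting time; replay on Petri nets or other formal notations is considerably more involved. The function `annotate_waiting_times` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def annotate_waiting_times(timed_traces):
    """Map each directly-follows step (a, b) to (frequency, mean waiting time)."""
    durations = defaultdict(list)
    for trace in timed_traces:
        for (a, t_a), (b, t_b) in zip(trace, trace[1:]):
            durations[(a, b)].append(t_b - t_a)
    return {
        step: (len(ds), sum(ds, timedelta()) / len(ds))
        for step, ds in durations.items()
    }


# Hypothetical timed log: one list of (activity, timestamp) pairs per case.
timed_log = [
    [("register", datetime(2011, 1, 3, 9, 0)),
     ("check", datetime(2011, 1, 3, 9, 30)),
     ("ship", datetime(2011, 1, 4, 9, 30))],
    [("register", datetime(2011, 1, 5, 9, 0)),
     ("check", datetime(2011, 1, 5, 10, 30)),
     ("ship", datetime(2011, 1, 7, 10, 30))],
]
stats = annotate_waiting_times(timed_log)
# The step with the largest mean waiting time is a bottleneck candidate.
bottleneck = max(stats, key=lambda step: stats[step][1])
```

In this invented log the step from "check" to "ship" has the longest mean waiting time, so it would be highlighted as the bottleneck.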
2. IEEE TASK FORCE ON PROCESS MINING
The growing interest in log-based process analysis motivated
the establishment of the IEEE Task Force on Process Mining.
The goal of this task force is to promote the research,
development, education, and understanding of process mining.
The task force was established in 2009 in the context of
the Data Mining Technical Committee of the Computational
Intelligence Society of the IEEE. Members of the task force
include representatives of more than a dozen commercial
software vendors (e.g., Pallas Athena, Software AG, Futura
Process Intelligence, HP, IBM, Fujitsu, Infosys, and Fluxicon),
ten consultancy firms (e.g., Gartner and Deloitte) and
over twenty universities.
Concrete objectives of the task force are: to make end-users,
developers, consultants, managers, and researchers aware of
the state-of-the-art in process mining, to promote the use
of process mining techniques and tools, to stimulate new
process mining applications, to play a role in standardization
efforts for logging event data, to organize tutorials, special
sessions, workshops, and panels, and to publish articles, books,
videos, and special issues of journals. For example, in 2010
the task force standardized XES (www.xes-standard.org),
a standard logging format that is extensible and supported