Recently I helped someone unfamiliar with process mining get started with analyzing a log. One of the things I noticed is that it is hard to get a grip on the overall ‘structure’ and on the meaning of the terms used. This is further complicated by inconsistent use of terminology in conversations and documentation, but also in ProM 5.2 itself. In this post I will try to explain some of the most common terms used in process mining and what they (should) mean.
Note: this is not a ‘definitive’ list, it is just how I think the terms should be interpreted and used! Furthermore, any suggestions and additions are welcome!
The overall picture: A system (e.g. a workflow management system) facilitates the processing of cases using a predefined process in which activities and their ordering are defined. The activities executed in this system are recorded in an event log, which can be ‘reverse engineered’ using, for instance, ProM. The log contains the actual executions of events on cases, each at a certain moment in time, by a certain actor, etc.
The result of this reverse engineering can be a process model describing the behavior recorded in the log, but performance, social network and constraint analyses are also possible. We won’t go into all the possible analyses in this post.
So, an (event) log contains information about process instances (e.g. cases) and the events that are performed on/for them.
It is also important to understand that there are two levels. One is the conceptual level, in which we do not talk about actual instances but talk in general about objects that can appear in a log. The other is the instance level, in which you look at specific process instances, event executions, originators, etc. In the list of general terms I have tried to indicate whether a term refers to a conceptual aspect or to an actual instance (or set of instances).
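To make the two levels a bit more tangible, here is a minimal sketch in Python. The class names, field names and the example values (‘Check invoice’, ‘Anne’, ‘case-1’) are my own illustrative assumptions; they are not taken from ProM or MXML. The conceptual level is modelled as things that can occur, the instance level as things that actually did occur.

```python
from dataclasses import dataclass, field
from datetime import datetime


# Conceptual level: objects that *can* appear in a log.
@dataclass(frozen=True)
class Activity:
    name: str


@dataclass(frozen=True)
class Resource:
    name: str


# Instance level: things that actually happened and were recorded.
@dataclass
class EventInstance:
    activity: Activity
    event_type: str        # e.g. 'start' or 'complete'
    originator: Resource
    timestamp: datetime


@dataclass
class ProcessInstance:
    case_id: str           # hypothetical identifier for the case being followed
    events: list = field(default_factory=list)


# One activity and one resource (conceptual) ...
check_invoice = Activity("Check invoice")
anne = Resource("Anne")

# ... and two recorded executions for one specific case (instance).
case_1 = ProcessInstance("case-1")
case_1.events.append(
    EventInstance(check_invoice, "start", anne, datetime(2010, 3, 1, 9, 0))
)
case_1.events.append(
    EventInstance(check_invoice, "complete", anne, datetime(2010, 3, 1, 9, 30))
)

# Two event instances, but only one distinct activity behind them.
activities_seen = {event.activity for event in case_1.events}
duration = case_1.events[1].timestamp - case_1.events[0].timestamp
```

Note that the same activity shows up in two event instances: that is exactly the difference between the two levels.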
General Process Mining terms:
(Used in ProM 5.2 and MXML, new terms are used in ProM 6 and the XES event log format)
- Activity An action or task that can be performed for a process instance (conceptual level);
- Data attribute An extra attribute recorded in the MXML file. Examples are the amount of a purchase order or the patient’s age. These attributes can for instance be used for decision analysis in ProM (conceptual level);
- Event This can refer either to an activity or to an event instance performed by a resource at a certain time for a specific process instance. The meaning therefore depends on the context in which it is used;
- Event Class Used in the ProM Dashboard, where it refers to the number of different activities encountered in the log (instance level);
- Event Log A recording of a set of events. MXML is an example of an event log format (instance level);
- Event Instance A recording of an executed event with information such as execution timestamp, event type and originator (instance level);
- Event Type Each activity can be in one of several states. The most commonly used states are ‘start’ and ‘complete’. The meaning is very straightforward: an activity is started and a certain amount of time later it is completed. There are several other event types or states, for a complete overview see figure 4 in the ‘MXML paper’ (PDF) (might be outdated) (conceptual level);
- Log The original log generated by the source system, which records the things that have happened. To be used within ProM, it needs to be converted to the MXML format using the ProM Import Framework (instance level);
- Process Instance (PI) The object you are following and on/for which events occur. Examples are cases, patients, machines etc. (can be both conceptual and instance level);
- Process mining Analyzing a business process based on an event log, see http://www.processmining.org;
- ProM An application for applying several process mining techniques to an event log, see http://www.processmining.org. The version at the time of writing is 5.2, and version 6.0 is under development (nightly builds are available);
- ProM Import Framework A framework for converting event logs to the MXML event log format. A set of converters for common formats is available but new converters can be programmed in Java;
- Model Element Used in the ProM Dashboard Summary, it should be interpreted as ‘activity’;
- MXML A meta model for event logs. An event log needs to be in this XML format to be processed by ProM. More information can be found in the ‘meta model for process mining’ paper (PDF) (conceptual level);
- MXML log The actual MXML file with all the recordings following the MXML format (instance level);
- Resource Any actor that can execute an activity, for example humans, the system itself or a web service (conceptual level);
- Timestamp A time indication consisting of a date and possibly a time part (instance level).
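To show how several of these terms come together in an MXML log, here is a small sketch in Python that reads a hand-made MXML fragment using only the standard library. The element names (WorkflowLog, ProcessInstance, AuditTrailEntry, WorkflowModelElement, EventType, Timestamp, Originator, Data/Attribute) are written down as I know them from MXML, so please check them against the schema before relying on them. The log content itself is made up for illustration, and the function `read_events` is a hypothetical helper, not part of ProM.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up MXML fragment: two process instances, four event instances.
MXML_SAMPLE = """<WorkflowLog>
  <Process id="invoices">
    <ProcessInstance id="case-1">
      <Data><Attribute name="amount">250</Attribute></Data>
      <AuditTrailEntry>
        <WorkflowModelElement>Check invoice</WorkflowModelElement>
        <EventType>start</EventType>
        <Timestamp>2010-03-01T09:00:00</Timestamp>
        <Originator>Anne</Originator>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Check invoice</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2010-03-01T09:30:00</Timestamp>
        <Originator>Anne</Originator>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Pay invoice</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2010-03-02T14:00:00</Timestamp>
        <Originator>System</Originator>
      </AuditTrailEntry>
    </ProcessInstance>
    <ProcessInstance id="case-2">
      <AuditTrailEntry>
        <WorkflowModelElement>Check invoice</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2010-03-03T10:15:00</Timestamp>
        <Originator>Bob</Originator>
      </AuditTrailEntry>
    </ProcessInstance>
  </Process>
</WorkflowLog>"""


def read_events(mxml_text):
    """Return one dict per event instance found in the MXML text."""
    root = ET.fromstring(mxml_text)
    events = []
    for process_instance in root.iter("ProcessInstance"):
        for entry in process_instance.iter("AuditTrailEntry"):
            events.append({
                "process_instance": process_instance.get("id"),
                "activity": entry.findtext("WorkflowModelElement"),
                "event_type": entry.findtext("EventType"),
                "timestamp": entry.findtext("Timestamp"),
                "originator": entry.findtext("Originator"),
            })
    return events


events = read_events(MXML_SAMPLE)

# Instance level: the recorded event instances and process instances.
process_instances = sorted({e["process_instance"] for e in events})

# Conceptual level: the distinct activities and resources behind them.
activities = sorted({e["activity"] for e in events})
resources = sorted({e["originator"] for e in events})
```

Here the ‘Model Element’ from the ProM Dashboard corresponds to the WorkflowModelElement, i.e. the activity, and the originator is the resource that executed it.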
Well, that’s the list for now. I hope I helped someone and did not add to the confusion. If you have any questions, suggestions or additions, please post a comment! The ‘conceptual vs. instance’ part especially was hard for me to explain, so any improvements are welcome.
– Joos –
P.S. @my supervisor: I wrote this article over the weekend and scheduled it for publication on Tuesday, so don’t think I’m procrastinating.
This article was first published on Blog of Joos Buijs: Process Mining Terms: A Small Glossary.