From relational database to valuable event logs for process mining – A procedure
Fortunately, the huge potential of process mining has already been discovered in a variety of business settings. In industry, more and more companies are learning about its potential value. In the meantime, academic researchers continue their quest for the best algorithm, the most meaningful metrics, the most understandable visualisations, etcetera. Whatever ‘best’, ‘meaningful’, and ‘understandable’ may be… These are food for thought and discussion in their own right. But I’d like to address a different mini research topic of its own: the event log.
An implicit assumption in process mining (both research and applications) is the existence of an event log.
The event log can be generally described as ‘A collection of events. An event 1) refers to a specific activity, 2) that took place at a certain moment in time, and 3) can be assigned to a unique case’.
Companies often do have this type of information. Yet in many cases it is hidden in a database system and does not present itself in a ready-to-go format. Hence the challenge of restructuring this data into an event log. To arrive at a valuable event log, a thorough understanding of the event log structure and of its link to the original data structure is necessary.
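To make this restructuring a bit more tangible, here is a minimal sketch in Python with an in-memory SQLite database. It is an illustration only, not the procedure itself: the tables `purchase_order` and `invoice`, their columns, the activity names, and the helper functions are all hypothetical. The sketch assumes that the purchase order is chosen as the case, and that each timestamp column in the source tables corresponds to one activity.

```python
import sqlite3


def demo_connection():
    """Create a tiny, hypothetical relational database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE purchase_order (
            order_id    INTEGER PRIMARY KEY,
            created_at  TEXT NOT NULL,
            approved_at TEXT              -- NULL while not yet approved
        );
        CREATE TABLE invoice (
            invoice_id  INTEGER PRIMARY KEY,
            order_id    INTEGER NOT NULL REFERENCES purchase_order(order_id),
            received_at TEXT NOT NULL
        );
        INSERT INTO purchase_order VALUES
            (1, '2007-03-01 09:00', '2007-03-02 14:30'),
            (2, '2007-03-01 10:15', NULL);
        INSERT INTO invoice VALUES
            (10, 1, '2007-03-10 08:00'),
            (11, 2, '2007-03-05 11:00');
        """
    )
    return conn


def build_event_log(conn):
    """Return events as (case_id, activity, event_time) rows.

    Each timestamp column becomes one activity; the order id is the case.
    Rows without a timestamp are skipped, since no event took place.
    """
    query = """
        SELECT order_id AS case_id, 'Create order' AS activity,
               created_at AS event_time
        FROM purchase_order
        UNION ALL
        SELECT order_id, 'Approve order', approved_at
        FROM purchase_order
        WHERE approved_at IS NOT NULL
        UNION ALL
        SELECT order_id, 'Receive invoice', received_at
        FROM invoice
        ORDER BY case_id, event_time
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for event in build_event_log(demo_connection()):
        print(event)
```

Every row in the result satisfies the three elements of the description above: an activity, a moment in time, and a unique case. Note that choosing the order as the case is itself a decision; taking the invoice as the case instead would produce a different event log from the same data.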
When teaching my students how to build a valuable event log, pointing out all the different decisions and options they have to bear in mind, I noticed the trial-and-error learning curve and the lack of proper teaching material. Building a decent event log came across as a set of arbitrary decisions, and they could not (yet) grasp the consequences of these decisions.
Thinking back on how I learned the ‘art of building event logs’, it struck me that I started my first event log building project almost 10 years ago (in 2007). Actually, the struggles of my students now coincide with the struggles of the companies I have had the privilege to work with. Over the years, I guided different companies on their journey into the world of process mining. Along this journey, I too had my fair share of trial-and-error.
So this is the outcome: I wrote a procedure to build valuable event logs from data stored in relational databases. The procedure is the result of distilling and structuring the experiences I have had over the years. It is not based on academic research, but should be seen as a report of ‘lessons learned’, poured into a manual. The aim is twofold:
- to help novice process (mining) analysts take those first steps on the learning curve, and
- to create awareness of the importance of a well-thought-out event log structure.
I hope I can succeed in both aims. This leaves me only one more thing to say: happy event log building!
Mieke Jans
Source: Research Group Business Informatics @ UHasselt