Blog of Joos Buijs

About personal things, process mining and the rest in life.

Archive for November 2009

Funny process model week 47: When to meet with your advisor?

leave a comment »

I’m starting to notice that my ‘Funny process model’ series is not so much about process models any more… Hmz, maybe process models are not meant to be funny??? Well, here is a funny PhD comic then:

My meeting is scheduled for Friday morning every other week. The ‘Pro’ is true, the ‘con’ not, or not so far at least.

– Joos –


Written by Joos Buijs

November 20, 2009 at 17:00

Process Mining Terms: A Small Glossary

leave a comment »

Recently I helped someone unfamiliar with process mining get started with analyzing a log. One of the things I noticed is that it is hard to get to know the overall ‘structure’ and meaning of the terms used. This is further complicated by inconsistent use of terminology, not only in conversations and documentation but also in ProM 5.2. In this post I will try to explain some of the most common terms used in process mining and what they (should) mean.

Note: this is not a ‘definitive’ list, it is just how I think the terms should be interpreted and used!!! Furthermore, any suggestions and additions are welcome!!!

The overall picture: A system (e.g. a workflow management system) facilitates the processing of cases using a predefined process in which activities and their ordering are defined. The activities executed in this system are recorded in an event log, which can be ‘reverse engineered’ using ProM, for instance. The log contains the actual executions of events on cases, each at a certain moment in time and by a certain actor.
The result of this reverse engineering can be a process model describing the behavior recorded in the log, but performance, social network and constraint analyses are also possible. We won’t go into all the possible analyses in this post.

So, an (event) log contains information about process instances (e.g. cases) and the events that are performed on/for them.

It is also important to understand that there are two levels. One is the conceptual level, at which we do not talk about actual instances but about the kinds of objects that can appear in a log. The other is the instance level, at which you look at specific process instances, event executions, originators, etc. In the general terms list I tried to indicate whether a term refers to a conceptual aspect or to an actual instance (or set of instances).

General Process Mining terms:

(Used in ProM 5.2 and MXML, new terms are used in ProM 6 and the XES event log format)

  • Activity An action or task that can be performed for a process instance (conceptual level);
  • Data attribute An extra attribute recorded in the MXML file. Examples are the amount of a purchase order or the patient’s age. These attributes can for instance be used for decision analysis in ProM (conceptual level);
  • Event This can refer either to an activity or to an event instance performed by a resource at a certain time for a specific process instance. The meaning therefore depends on the context in which it is used;
  • Event Class Used in the ProM Dashboard, it refers to the number of different activities encountered in the log (instance level).
  • Event Log A recording of a set of events, an MXML log is an example of an event log format (instance level);
  • Event Instance A recording of an executed event with information such as execution timestamp, event type and originator (instance level);
  • Event Type Each activity can be in one of several states. The most commonly used states are ‘start’ and ‘complete’. The meaning is very straightforward: an activity is started and a certain amount of time later it is completed. There are several other event types or states, for a complete overview see figure 4 in the ‘MXML paper’ (PDF) (might be outdated) (conceptual level);
  • Log The original log generated by the source system which records things that have happened. In order to be used within ProM this needs to be converted to the MXML format using the ProM Import Framework (instance level).
  • Process Instance (PI) The object you are following and on/for which events occur. Examples are cases, patients, machines etc. (can be both conceptual and instance level);
  • Process mining Analyzing a business process based on an event log, see http://www.processmining.org;
  • ProM An application to apply several process mining techniques to an event log, see http://www.processmining.org. The version at the moment of writing is 5.2 and version 6.0 is under development (nightly builds are available);
  • ProM Import Framework A framework for converting event logs to the MXML event log format. A set of converters for common formats is available but new converters can be programmed in Java;
  • Model Element Used in the ProM Dashboard Summary, it should be interpreted as ‘activity’.
  • MXML A meta model for event logs. An event log needs to be in this XML format to be processed by ProM. More information can be found in the ‘meta model for process mining’ paper (PDF) (conceptual level);
  • MXML log The actual MXML file with all the recordings following the MXML format (instance level);
  • Resource Any actor that can execute an activity, for example humans, the system itself or a web service (conceptual level);
  • Timestamp A time indication consisting of a date and possibly a time part (instance level);
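
To make the terms a bit more concrete, here is a small Python sketch that reads a hand-written MXML fragment and pulls out the things named in the list above. The element names (WorkflowLog, ProcessInstance, AuditTrailEntry, WorkflowModelElement, EventType, Timestamp, Originator) are the ones I know from the MXML format, but the order process, the cases and the originators are made up, and the helper function is purely illustrative. Treat it as a sketch, not as a reference for the format:

```python
import xml.etree.ElementTree as ET

# Hand-written example log; the process, cases and originators are invented.
MXML = """
<WorkflowLog>
  <Process id="orders">
    <ProcessInstance id="case-1">
      <AuditTrailEntry>
        <WorkflowModelElement>Create order</WorkflowModelElement>
        <EventType>start</EventType>
        <Timestamp>2009-11-02T09:00:00</Timestamp>
        <Originator>anne</Originator>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Create order</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2009-11-02T09:05:00</Timestamp>
        <Originator>anne</Originator>
      </AuditTrailEntry>
      <AuditTrailEntry>
        <WorkflowModelElement>Pay order</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2009-11-03T14:00:00</Timestamp>
        <Originator>system</Originator>
      </AuditTrailEntry>
    </ProcessInstance>
    <ProcessInstance id="case-2">
      <AuditTrailEntry>
        <WorkflowModelElement>Create order</WorkflowModelElement>
        <EventType>complete</EventType>
        <Timestamp>2009-11-02T10:30:00</Timestamp>
        <Originator>bob</Originator>
      </AuditTrailEntry>
    </ProcessInstance>
  </Process>
</WorkflowLog>
"""


def read_event_instances(mxml_text):
    """Return one dict per event instance found in the log (instance level)."""
    root = ET.fromstring(mxml_text)
    instances = []
    for pi in root.iter("ProcessInstance"):
        for entry in pi.iter("AuditTrailEntry"):
            instances.append({
                "process_instance": pi.get("id"),
                "activity": entry.findtext("WorkflowModelElement"),
                "event_type": entry.findtext("EventType"),
                "timestamp": entry.findtext("Timestamp"),
                "originator": entry.findtext("Originator"),
            })
    return instances


event_instances = read_event_instances(MXML)

# Conceptual level: which activities and resources can appear in this log?
activities = sorted({e["activity"] for e in event_instances})
resources = sorted({e["originator"] for e in event_instances})
```

Here `event_instances` lives at the instance level (four recorded executions spread over two process instances), while `activities` and `resources` are the conceptual-level summaries derived from them.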

Well, that’s the list for now. I hope I helped someone and did not add to the confusion. If you have any questions, suggestions or additions, please post a comment!!! Especially the ‘conceptual vs. instance’ part was hard for me to explain, so any improvements are welcome.

– Joos –

P.S. @my supervisor: I wrote this article over the weekend and scheduled it for publication on Tuesday, so don’t think I’m procrastinating 🙂


Written by Joos Buijs

November 17, 2009 at 17:00

Posted in Process Mining


Funny process model week 46: Economic crisis impact in university staff

leave a comment »

Well, strengthened by the comment of Anne on my last ‘funny process model of the week’ I dare to post this one from PHD Comics:

My only hope now is that Anne is right or that my supervisor does not read my blog (before March 1).

Enjoy the weekend!!!

– Joos –

P.S. My holiday was superb!!! Nice cottage, nice environment and nice girlfriend 😉

Written by Joos Buijs

November 13, 2009 at 22:00

About my master’s project: more concrete

leave a comment »

Since the introduction to my master project in my last post might have been a little vague and general, I thought it would be a good idea to show you the user interface sketches I made a couple of weeks ago. They are not final of course, and some details have changed in the meantime, but the main idea stays the same.

Two example screens of my future application at work are:

Mapping application just started with only the log and trace elements.

This screen shows the editor after project creation with only the basic log and trace elements (which cannot be added or removed).

Mapping Editor, with events

This screen shows the editor with some events and properties defined.

It might be good to note that you should view the mapping at the ‘meta’ level. This means that we do not define the event instances themselves but define where to find the events for the traces (for which we have also defined how to retrieve them). To complicate things further, we might not need to specify each event type (e.g. “Create order”, “Pay order”) separately: if we have some kind of event log as input, the event name (or type or WFMElt or whatever you call it) could be stored in the data source. In that case you might only need to specify one event mapping, which retrieves all the event instances from the data source.

The second screen for instance shows the definition of the “Create” event which can be found in the table “Order”. The username and timestamp values can be extracted from the data source as defined in this example. This event would be added to each trace that we can extract according to the trace mapping definition.

Additionally, some mapping properties are needed for execution. These are the ‘default’ entity to use, how to link two entities together (if not defined in the data source) and possible selection criteria for traces and/or events. Furthermore, the trace needs to have a unique identifier defined so we can connect events to traces.
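
To show what ‘defining where to find events’ could mean in practice, here is a rough Python sketch that executes such a mapping against an in-memory SQLite database. Everything in it is an assumption made for illustration: the layout of the `MAPPING` dictionary, the `Payment` table and all column names are invented, and it is not how the actual application works. Linking entities and selection criteria are left out to keep it short:

```python
import sqlite3

# Hypothetical mapping definition: it says WHERE to find traces and events,
# it does not list the event instances themselves.
MAPPING = {
    "trace": {"table": "Order", "id_column": "order_id"},
    "events": [
        {"name": "Create", "table": "Order", "trace_id_column": "order_id",
         "originator_column": "created_by", "timestamp_column": "created_at"},
        {"name": "Pay", "table": "Payment", "trace_id_column": "order_id",
         "originator_column": "paid_by", "timestamp_column": "paid_at"},
    ],
}


def execute_mapping(conn, mapping):
    """Run the mapping and return {trace id: [event instances sorted by time]}."""
    trace = mapping["trace"]
    sql = 'SELECT "{id}" FROM "{table}"'.format(
        id=trace["id_column"], table=trace["table"])
    # The unique trace identifier is what lets us attach events to traces.
    traces = {row[0]: [] for row in conn.execute(sql)}

    for ev in mapping["events"]:
        sql = 'SELECT "{tid}", "{org}", "{ts}" FROM "{table}"'.format(
            tid=ev["trace_id_column"], org=ev["originator_column"],
            ts=ev["timestamp_column"], table=ev["table"])
        for trace_id, originator, timestamp in conn.execute(sql):
            if trace_id in traces:
                traces[trace_id].append({
                    "name": ev["name"],
                    "originator": originator,
                    "timestamp": timestamp,
                })

    # ISO-style timestamps sort correctly as plain strings.
    for events in traces.values():
        events.sort(key=lambda e: e["timestamp"])
    return traces


# Tiny invented data source to run the mapping on.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Order" (order_id INTEGER PRIMARY KEY, created_by TEXT, created_at TEXT);
CREATE TABLE "Payment" (order_id INTEGER, paid_by TEXT, paid_at TEXT);
INSERT INTO "Order" VALUES (1, 'anne', '2009-11-02T09:00:00');
INSERT INTO "Order" VALUES (2, 'bob', '2009-11-02T10:30:00');
INSERT INTO "Payment" VALUES (1, 'system', '2009-11-03T14:00:00');
""")

traces = execute_mapping(conn, MAPPING)
```

Note that the mapping contains only two event definitions, yet executing it yields three event instances: the “Create” event is added to every trace found through the trace mapping, and “Pay” only to the order that has a payment row.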

The log, trace, event and attribute terms are re-used from the XES definition and the whole mapping definition quite closely follows this event log meta format (where this mapping is another meta level higher I suppose).

Well, I know that this might still sound rather vague, but I hope it is at least less vague than my previous post.

If you have any questions, please ask!

Joos

PS: I’m actually on holiday this week (this is a scheduled post) so I might not reply before November 16 (2009).

Edit 13-11-2009 21:15: Improved some text, ‘fixed’ the images and added tags to post. Memo to self: never create a post 2 minutes before you leave for a holiday 😉

Written by Joos Buijs

November 10, 2009 at 17:00

Funny process model week 45: The Scientific Method

with 2 comments

Well, this ‘process model’ is rather tricky to post because without the proper explanation it might appear that I or we (here at the TU/e or my faculty) work this way. I must say that, to my knowledge, this is not the case. The proposed process model is just a hypothesis of what reality could look like and needs to be tested…

Or how it works in reality...


Hope you don’t feel offended 😉

Joos

Written by Joos Buijs

November 6, 2009 at 17:00

About my master project

with one comment

To make sure that this blog won’t be about funny process models alone it might be a good idea to introduce and explain my master’s project subject: Mapping Data Sources to XES in a Generic Way. Let’s dissect this rather vague sentence to explain what it is all about:

Mapping: in this context it means to let the user define a way to map one data source to another.

Data Sources: most people might think of databases first, but text files, XML files and even web services can be considered data sources. Although it remains to be seen how many data sources we will be able to support, we intend to support at least the common database formats and the CSV (comma separated values) plain text format.

XES: the one requiring the most explanation. Although it is pronounced ‘excess’, similar to one of the well known database formats from a well known software vendor, it means something completely different. In this case we refer to the ‘Extensible Event Stream’ format. This is an extensible event log format for, well, storing event logs. For people familiar with the MXML format: XES is the new and improved MXML! For people unfamiliar with MXML: visit processmining.org (more specifically, read an informal introduction to MXML (.PPT, 0.9 MB) or read the more formal MXML introduction paper (PDF, 130 KB)). The XES meta-model is implemented in the OpenXES Java library; more information about XES can also be found there.

a Generic Way: of course, we want our application to be applicable in many situations and therefore it must be generic.

So, in brief, the goal of my project is to develop an application that allows a user to define a mapping from a (set of) data source(s) to the XES event log format and to execute this mapping. The result is an event log that can be used for process mining with (the new version of) ProM.

If you have done one or several process mining projects, you must know that preparing the data is one of the most time consuming (and, in my opinion, most annoying) parts of a process mining project. This master project aims at providing an application that will allow you, the process miner, maybe together with the domain expert, to define a mapping from the data source(s) to the XES event log format without the need to write (Java) code.
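
To give an idea of the kind of hand-written conversion code the application should make unnecessary, here is a minimal sketch (in Python rather than Java, to keep it short) that turns a few CSV rows into an XES-like document. The CSV columns and the `csv_to_xes` function are invented for this example. The attribute keys concept:name, org:resource and time:timestamp are the ones I know from the standard XES extensions, but the output leaves out the extension declarations and other header elements a complete XES file would have, so it is not guaranteed to load in ProM as is:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented example data; rows are assumed to be in time order per case.
CSV_DATA = """case_id,activity,user,time
1,Create order,anne,2009-11-02T09:00:00
1,Pay order,system,2009-11-03T14:00:00
2,Create order,bob,2009-11-02T10:30:00
"""


def csv_to_xes(csv_text):
    """Build a <log> element with one <trace> per case and one <event> per row."""
    log = ET.Element("log")
    traces = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        trace = traces.get(row["case_id"])
        if trace is None:
            # First row for this case: create the trace and name it.
            trace = ET.SubElement(log, "trace")
            ET.SubElement(trace, "string", key="concept:name", value=row["case_id"])
            traces[row["case_id"]] = trace
        event = ET.SubElement(trace, "event")
        ET.SubElement(event, "string", key="concept:name", value=row["activity"])
        ET.SubElement(event, "string", key="org:resource", value=row["user"])
        ET.SubElement(event, "date", key="time:timestamp", value=row["time"])
    return log


xes_log = csv_to_xes(CSV_DATA)
xes_text = ET.tostring(xes_log, encoding="unicode")
```

Even for this trivial case you have to hard-code which column holds the case, the activity, the user and the time. That choice is exactly what a mapping definition should capture instead.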

Any questions, comments, feature requests etc. etc. are more than welcome!!!

See you at the next post! (Which will either be a ‘funny process model’ or a post showing some GUI designs for the application)

Joos

Written by Joos Buijs

November 3, 2009 at 17:00