CRISP-DM Data Mining Methodology
FastStats is a cost-effective marketing data analysis and campaign management solution. It enables users to transform data into intelligence through customer segmentation, marketing data mining, predictive modelling, customer profiling and campaign management.
To help deliver analytical projects smoothly and within the required scope, budget and timescales, Qbase follows an internationally recognised standard data mining process called CRISP-DM (Cross Industry Standard Process for Data Mining).
This process consists of several phases, each of which is broken down into sub-phases and tasks. The principal phases are:
- Business Understanding
- Data Understanding, Preparation and Enrichment
- Model Building and Evaluation
- Deployment
Phase 1 – Business Understanding
This initial phase outlines project aims and objectives so that they are clearly defined, understood and documented. This knowledge is then used by the analyst to develop the data mining problem definition and a preliminary plan designed to achieve the objectives.
Phase 2 – Data Understanding, Preparation and Enrichment
This stage typically accounts for over 60% of the total project time. The analyst spends much of it gaining a full ‘user’ understanding of the data, including its structure, purpose and method of collection. A full list and description of the available data is also collated during this phase.
The analyst also imports all the data into the appropriate analytical tool and ensures it is clean and ready for the analytical stage(s). This includes analysing and resolving missing data, and overlaying any data dictionary used to convert raw data into a more user-friendly format.
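As an illustration of the missing data and data dictionary steps described above, here is a minimal Python sketch. The field names, codes, sample records and the choice of filling numeric gaps with the mean are all assumptions made for the example, not part of the Qbase process or of any particular analytical tool.

```python
# Hypothetical raw records: None marks a missing value.
RAW_RECORDS = [
    {"customer_id": 1, "gender": "M", "region": "01", "spend": 120.0},
    {"customer_id": 2, "gender": "F", "region": None, "spend": None},
    {"customer_id": 3, "gender": None, "region": "02", "spend": 80.0},
]

# Hypothetical data dictionary mapping raw codes to readable labels.
DATA_DICTIONARY = {
    "gender": {"M": "Male", "F": "Female"},
    "region": {"01": "North", "02": "South"},
}


def missing_counts(records):
    """Count missing (None) values per field across all records."""
    counts = {}
    for record in records:
        for field, value in record.items():
            counts[field] = counts.get(field, 0) + (value is None)
    return counts


def mean_of_present(records, field):
    """Mean of a numeric field, ignoring missing values."""
    values = [r[field] for r in records if r[field] is not None]
    return sum(values) / len(values)


def prepare(records, dictionary, numeric_fill):
    """Return cleaned copies: fill gaps, then overlay dictionary labels."""
    cleaned = []
    for record in records:
        row = dict(record)  # copy so the raw data is left untouched
        for field, value in row.items():
            if value is None:
                # Fields with a supplied fill value use it; others get a label.
                row[field] = numeric_fill.get(field, "Unknown")
            elif field in dictionary:
                # Codes not in the dictionary are kept as-is, not dropped.
                row[field] = dictionary[field].get(value, value)
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    print(missing_counts(RAW_RECORDS))
    fill = {"spend": mean_of_present(RAW_RECORDS, "spend")}
    for row in prepare(RAW_RECORDS, DATA_DICTIONARY, fill):
        print(row)
```

Reporting the missing counts before resolving anything keeps the analysis and the resolution as separate, reviewable steps. How gaps should actually be resolved (mean fill, a separate ‘Unknown’ category, or exclusion) depends on the project and would be agreed with the business.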
Phase 3 – Model Building and Evaluation
In this phase, various modelling techniques are selected and applied. Typically, several techniques are available for the same type of data mining problem. Some techniques have specific requirements for the type and form of the data, so stepping back to the data preparation phase is often needed.
Once the model has been built, it is important to evaluate it more thoroughly before proceeding to final deployment. This includes a review of the steps taken to construct the model, to be certain it properly achieves the business objectives identified in Phase 1. A key objective of the evaluation is to determine whether any important business issues have not been sufficiently considered. At the end of this phase, a go/no-go decision on the use of the data mining model or results should be reached.
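One way to make the go/no-go decision concrete is to compare model performance on held-out data against a threshold agreed during Phase 1. The sketch below is only an illustration: the lift measure, the function names, the 20% targeting fraction and the minimum lift of 2.0 are all assumptions for the example, not criteria taken from CRISP-DM or from Qbase.

```python
def lift_at_fraction(scores, outcomes, fraction):
    """Response rate in the top-scored fraction divided by the overall rate.

    scores:   model scores for held-out customers (higher = more likely)
    outcomes: 1 if the customer actually responded, otherwise 0
    """
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("no positive outcomes in the holdout sample")
    return top_rate / overall_rate


def go_no_go(scores, outcomes, fraction, minimum_lift):
    """Return ('go' or 'no-go', lift) against a business-agreed threshold."""
    lift = lift_at_fraction(scores, outcomes, fraction)
    return ("go" if lift >= minimum_lift else "no-go"), lift


if __name__ == "__main__":
    # Hypothetical holdout sample of ten customers.
    holdout_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    holdout_outcomes = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    decision, lift = go_no_go(holdout_scores, holdout_outcomes, 0.2, 2.0)
    print(decision, round(lift, 2))
```

A numeric check like this covers only part of the evaluation. The review of how the model was constructed, and of any business issues not yet considered, remains a matter of judgement.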
Phase 4 – Deployment
In the majority of instances, the creation of the model is not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the knowledge gained will need to be organised and presented in a way that the organisation can use.
The deployment phase looks at creating and implementing a repeatable, systematic data mining process.