Data analytics is the science of analysing data in order to make inferences. At BI Builders we offer a broad range of analytical techniques, each suited to a distinct type of problem.
- Descriptive analytics helps answer questions about the current status quo. It usually includes dashboards, reports and different types of KPIs.
- Diagnostic analytics helps answer questions about why things happened. This may include analytics to discover anomalies or unknown relations among data elements.
- Predictive analytics helps answer questions about what will happen in the future. Techniques for answering such questions fall within the field of machine learning.
- Prescriptive analytics helps answer questions about what should be done. Prescriptive analytics is what we often call artificial intelligence. It is a kind of machine learning that continuously takes in new data to produce more accurate predictions and better-defined decisions.
Process of a project
Descriptive, diagnostic, predictive and prescriptive analytics support and build on each other. Therefore, we advise our clients to start with descriptive analytics and move to predictive and prescriptive analytics as they gain more insight.
While descriptive analytics is quite straightforward to understand, diagnostic, predictive and prescriptive analytics require more explanation. Below we describe the three main steps of working with these types of analytics:
- Problem definition
- Data exploration
- Validation, implementation and visualization
Problem definition
When defining the problem, it is important to discuss the business reasons for an analytics project. Together with all the stakeholders, we will try to envision what the solution will look like and what it should be capable of. This usually involves a mix of technical and analytical challenges, and it is a key step of the analytical process.
Data exploration
The quality of the data used in the model determines the quality of the result. Here are some important elements to keep in mind:
- Variable identification
Understanding the business process is a key element in identifying the variables to be used in the analytics process. Still, this is a trade-off: reducing the set of variables too aggressively may improve model performance, but it may also hide some "unknown" (but important) relations among fields that are not obviously related.
- Univariate analysis
The relevant variables are analysed individually, using mathematical methods to identify their distribution. The result of this step is typically visualized by means of boxplots or distribution plots (in the case of numeric variables) or bar charts (in the case of categorical variables). Outlier identification is an important aspect of this analysis.
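As an illustration of outlier identification, here is a minimal sketch of the interquartile-range rule that boxplots commonly use to flag outliers. The function name `iqr_outliers`, the factor `k=1.5` and the sample data are assumptions made for this example, not a description of a specific project.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical daily order counts; one day is far from the rest.
orders = [12, 15, 14, 13, 16, 15, 14, 250, 13, 15]
outliers = iqr_outliers(orders)
print(outliers)  # [250]
```

Whether a flagged value is an error or a genuine but rare event is a business question, so the rule only suggests candidates for review.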
- Bivariate analysis
Bivariate analysis explores possible relationships between pairs of variables. It is perhaps the simplest way to determine to what extent we can predict the value of one variable from another. Uni- and bivariate analysis (often called descriptive or diagnostic statistics) can help both to discover "unknown" relations between data fields and to reduce the number of variables used in the model by identifying variables that carry the same information.
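One common bivariate measure for numeric variables is the Pearson correlation coefficient. The sketch below, using hypothetical variable names and made-up data, shows how a coefficient close to 1 can reveal two variables that carry the same information, while a strongly negative value hints at a relation worth investigating.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: the same price in two units, plus units sold.
price = [10.0, 12.0, 15.0, 20.0, 25.0]
price_cents = [p * 100 for p in price]   # redundant: same information
units_sold = [95, 90, 70, 60, 40]

r_redundant = pearson(price, price_cents)    # close to 1: drop one of the two
r_price_units = pearson(price, units_sold)   # strongly negative
```

Correlation only captures linear relations, so a low coefficient does not by itself prove that two variables are unrelated.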
- Missing value treatment
We sometimes need to deal with missing values in the data to avoid a biased model. Imputation is a typical way to handle missing data.
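A minimal sketch of one simple imputation technique, median imputation, is shown below. Representing missing values as `None` and the helper name `impute_median` are assumptions for this example; other techniques (mean, model-based imputation) may suit a given problem better.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical customer ages with two missing entries.
ages = [34, None, 29, 41, None, 38]
imputed = impute_median(ages)
print(imputed)  # [34, 36.0, 29, 41, 36.0, 38]
```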
- Feature engineering (variable creation and transformation)
Feature engineering is the science (and art) of extracting more information from existing data. Which variables need to be created or transformed depends closely on the problem definition.
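To make the idea concrete, the sketch below derives new variables from an order date and an amount. The function `engineer_features`, the chosen features and the sample values are purely illustrative assumptions; the right features always depend on the problem at hand.

```python
import math
from datetime import date

def engineer_features(order_date, amount):
    """Derive new variables from a raw order date and order amount."""
    weekday = order_date.weekday()          # 0 = Monday ... 6 = Sunday
    return {
        "weekday": weekday,                 # variable creation
        "is_weekend": weekday >= 5,         # variable creation
        "log_amount": math.log1p(amount),   # transformation of a skewed variable
    }

# Hypothetical order placed on a Saturday.
features = engineer_features(date(2024, 3, 9), 99.0)
```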
Depending on the problem definition and the results from the data exploration phase, we may choose mathematical (often statistical) models with varying levels of complexity, from simple linear regression to machine learning by means of clustering or decision trees.
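At the simple end of that range, a linear regression with one explanatory variable can be fitted with ordinary least squares in a few lines. The following is a minimal sketch on made-up data; `fit_line`, `ad_spend` and `revenue` are hypothetical names chosen for the example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend versus revenue.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(ad_spend, revenue)
print(f"revenue = {slope:.2f} * ad_spend + {intercept:.2f}")
```

In practice a library implementation would normally be used; the point here is only to show what the simplest model in the range looks like.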
Validation, implementation and visualization
When the model is refined enough to be used, it needs to be validated on new data to assess its usefulness and accuracy.
In most cases, feature engineering, modeling and validation are an iterative process, where new knowledge gained by the exploration and modeling processes leads to new ideas worth exploring.
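Validation on new data can be sketched as a simple holdout test: fit the model on one part of the data and measure the error on observations it has not seen. The example below assumes a one-variable linear model and uses mean absolute error as the accuracy measure; the data, the split and the helper names are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    return slope, mean_y - slope * mean_x

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observations; the last two play the role of "new" data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.1, 5.9, 8.0, 10.2, 11.8]
train_x, test_x = x[:4], x[4:]
train_y, test_y = y[:4], y[4:]

# Fit on the training part only, then score on the held-out part.
slope, intercept = fit_line(train_x, train_y)
predictions = [slope * v + intercept for v in test_x]
mae = mean_absolute_error(test_y, predictions)
print(f"MAE on held-out data: {mae:.2f}")
```

If the held-out error is much larger than the error on the training data, that is usually a signal to revisit the features or the model, which is where the iteration described above comes in.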
Curious to know more about what we have done in the advanced analytics field? Contact us at: firstname.lastname@example.org