Self-Service Data Preparation, why would you do that?

Self-Service Data Preparation, why would you do that?

In my previous posts, I have talked a lot about preparing your data and not forgetting your structured data in this time of Big Data, IOT and other cool stuff. Today I thought I’d take a step back and try to explain what our company do when it comes to preparing and structuring data and discuss how this can be a viable option for you.

So, what is Xpert BI?

We call it Self-Service data preparation or data warehouse automation. So, what does that mean and why should you do this instead of doing traditional ETL or ELT? A tool is only a tool

Our CTO and the brain behind Xpert BI, Erik Frafjord, had an idea that the process of getting to your data prepped and ready for reporting needed to be optimized, but not at the cost of quality, governance and documentation. In traditional data warehouse projects 80% of your time is spent preparing your data, and only 20% of the effort is spent on reports, analytics and distribution of the value your data add. You should try to switch that and make 20% go into the preparation and use your 80% on the decision-making process.

Using Xpert BI you always start with your metadata. The first step is to identify and understand your data sources. This being a flat file, or thousands of tables in your SAP installation. For a lot of the big ERP vendors like SAP or Axapta Xpert BI already have predefined application adapters where system specific metadata is combined with the platform specific metadata. This automatically enriches your data model with friendly-names and relationships to fully understand your system implementation.

Depending on the quality and availability of metadata, it is not unusual to include application owners or experts to enhance the source system data model. Doing this saves you a lot of time afterwards when you are going to use the data for further transformation processes or direct in analytics.

This understandable layer is what we call the idealization layer, here you have a “copy” of your source fully enriched with understandable data on the same level as your source. It is an ODS, but it is a lot more useful and understandable.

SAP before idealisationSAP data source after idealisation

(Example of idealized data from SAP)

Business logic, denormalizations, integrations and transformations of data are done using MS Management Studio using T-SQL. The reason for this is that we didn’t want to reinvent something that works, and by using existing technology the need for training is minimal using Xpert BI. So, joining tables, and applying your business rules you do here, either with the help of a wizard or by writing your own sql code. Our tool manages all data loads by always maintaining table dependencies can easily configure commonly used load options such as incremental loads, filters, surrogate keys and snapshots and SCD of any given dimension attributes.

This means that you can use the methodology of your choosing designing your data warehouse or your DataMart. If you want a straightforward Kimball approach or want to design a Data Vault you are free to choose the architectural design that fits your organization best. Xpert BI can handle any number of databases when it comes to complete technical documentation and dependency control.

We also enrich your solutions with surrogate keys so your joining goes incredibly fast. You also get the complete lineage of your solution as your technical documentation. This lineage is also used for knowing in which order your tables needs to be loaded, so you never should worry about the sequence you are running your jobs. The tool will automatically optimize the parallelization and sequencing of data loads.Dont be dependent on one developer

In a technical perspective, this sounds and works amazing, but it’s not only in the technical picture you will get a win. What you do here is to force your developers into a more governing way of
doing the development.  You take away their possibility to get creative. Being a musician in my spare time it kind of hurts a little to say this, I have always encouraged the people working for me to be creative. But in this part of the process you need a governed process so anyone can read and continue to develop where someone left off. You shouldn’t be dependent on that one developer that made the original package to do the changes.

So, who would benefit from this?

The sales pitch would of course say everyone. Luckily I am not a salesperson. Let’s try to refine and identify when you should think about an automation tool like Xpert BI.

On a high level, there are three scenarios that would apply.

  1. You are starting a new data warehouse initiative.
  2. You are rebuilding or starting over your data warehouse initiative.
  3. Your data warehouse initiative is not giving you the full picture or its just updating too slow.

If don’t have a data warehouse yet you should consider the automated way. This is probably the most obvious scenario where you should explore the automated opportunity.

The second scenario is much the same as the first, but here you usually have an advantage that some of your business rules are defined, and you only need to revise them. You also most likely must change your architecture because you have an architecture that has been disrupted, or you inherited the architecture from your predecessor.

The third scenario we could call the pain scenario. Here the organization is most likely experiencing that the reports doesn’t give them the answers they are looking for, the wrong answers or no answers at all. There is a lot of reason for a solution to get into this stage, but the main reason is the lack of governance in the development process. This scenario often leads to a rebuild or a start over, or at least it should.

So, having defined the three main scenarios where you should consider an automated approach, does that mean you should? Well you are getting closer to an “Of Course”, but before you rush into buying yet another tool that’s going to magically fix things, look at your organization, is this going to fit our organization?

Do your organization already have a development department staffed with BI developers, you could argue that with strong leadership and good governance they can develop and maintain your data warehouse without the use of a new tool like Xpert BI.

Big organizational changes that would lead to a change in tasks or downsizing, could lead to a disrupted department. Which again could and most likely will lead to a failed project.

I have saying that goes “A tool is only a tool, it’s the people that utilize the tool that make the change, never the tool itself”. So, promote the goal and the way to get there, before you squeeze the elephant into the buss. It might turn out to be a much smaller animal that needs the ride.

Hope I have clarified some of the things that Xpert BI can do for you, it’s not magic, but it is in many cases a lot smarter and a lot faster.