Back in 2007, when I first started working with ETL processes and building data warehouses, the learning curve was steep. Setting up a Data Warehouse (DWH) required plenty of planning and project management before the actual development could start. There were various steps that led to a functional DWH capable of servicing the business needs.
To name a few:
- Create source-to-target mappings, including business rules and logic.
- Create a naming convention to be used throughout the DWH.
- Document the business logic as you go on building the DWH.
- Generate and store surrogate keys.
- Decide what goes into SCD Type 1 and SCD Type 2.
- Design the architecture, whether star, snowflake, or just some form of denormalized data.
- Decide on the load frequency.
- Finally, load everything into the target data model.
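To make the SCD 2 and surrogate-key steps above concrete, here is a minimal sketch (not how any particular DWH tool implements it) of a Type 2 change being applied: the current dimension row is expired and a new row is inserted with a fresh surrogate key. The row layout and the function name `apply_scd2` are illustrative assumptions.

```python
from datetime import date

def apply_scd2(dim_rows, natural_key, new_attrs, load_date, next_sk):
    """Expire the current row for `natural_key` if its attributes changed,
    then append a new current row carrying the surrogate key `next_sk`.
    Each row is a dict: sk, nk, attrs, valid_from, valid_to, current."""
    for row in dim_rows:
        if row["nk"] == natural_key and row["current"]:
            if row["attrs"] == new_attrs:
                return dim_rows          # no change: keep history as-is
            row["valid_to"] = load_date  # close the old version
            row["current"] = False
            break
    dim_rows.append({
        "sk": next_sk, "nk": natural_key, "attrs": new_attrs,
        "valid_from": load_date, "valid_to": None, "current": True,
    })
    return dim_rows

# Usage: a customer moves city, so the dimension keeps both versions.
dim = [{"sk": 1, "nk": "C42", "attrs": {"city": "Oslo"},
        "valid_from": date(2015, 1, 1), "valid_to": None, "current": True}]
dim = apply_scd2(dim, "C42", {"city": "Bergen"}, date(2016, 6, 1), next_sk=2)
```

Under SCD Type 1 you would instead overwrite the attributes in place and never grow the table; Type 2 trades storage for full history.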
It is estimated that in a typical Business Intelligence (BI) project, close to 80% of the time and budget is spent on setting up the DWH. It is important to understand that a solid DWH architecture and design sets your entire BI project up for success, and any critical failure or misunderstanding in the design and architecture of the DWH can have serious business consequences. Considering these factors, automating your DWH implementation is a step every company should want to invest in.
I have spent some quiet Christmas nights in front of a burning fireplace reading articles about the future of the data warehouse. There are many opinions and arguments about what the future of data warehousing will look like.
Two types of professionals are arguing the pros and cons of the technical architecture surrounding your data warehouse solution.
The technocrats make strong arguments for specific technologies that will solve your challenges in the new data era.
On the other hand, you have professionals who are more of the old school and are skeptical of letting technology drive the type of challenge you are going to solve. Let's call them the conservatives.
Being a conservative myself, it's easy to point out what the technocrats do wrong, but are they as wrong as someone like me sometimes argues?
Being in the software business, one of our most important tasks is to let our customers know about our product. One of the ways we do that is to attend various conferences around the world. For two consecutive weeks in November we will spend a lot of time on the conference carpet, in sessions, and in our hotel rooms.
First out is the PASS Summit in Seattle. This is the Microsoft SQL Server user conference, with a lot of great topics on both traditional SQL Server and data warehousing, but probably even more on the new architecture and the new possibilities in Azure.
On August 10th, IKT-Norge, Visma and Rambøll presented their 10th annual report, called "IT i Praksis". It's a survey meant to assess the maturity of practical IT use, mainly in the public sector of Norway. It's a huge report and I won't cover all the topics here, but a couple of things stood out. The survey really boils down to the word digitalization: how can the public sector move faster so its users get the best digital experience possible?
This isn't just a challenge for the public sector. The challenges are the same whether you are a retailer, a bank, or really any other business that needs to connect with users digitally. There is one significant difference, though: how the work is organized and funded.
(I’m a poet and I know it)
I have been spending a lot of time warning people about the pitfalls, or rather the deep ends, of Data Lake and Big Data initiatives. First, I want to state that I think the Data Lake as a concept is a great idea, but it needs to co-exist with your more traditional data architecture.
So, what I’m discussing in this post is my take on the total data architecture as I see it in today’s data age.
We all know the data warehouse mantra "One single version of the truth", or the variant "One single source of the truth". The single-source statement will become even less relevant with the coming of Big Data, but the one single version of the truth is probably even more important now, in the big data age. You need one place to implement your business rules and one place to contain the truth. What we see now is that the place holding that truth is not always the same place.
In my previous posts, I have talked a lot about preparing your data and not forgetting your structured data in this time of Big Data, IoT and other cool stuff. Today I thought I'd take a step back, try to explain what our company does when it comes to preparing and structuring data, and discuss how this can be a viable option for you.
So, what is Xpert BI?
We call it self-service data preparation, or data warehouse automation. So, what does that mean, and why should you do this instead of traditional ETL or ELT?
I just got back from a long and exciting trip to London, where BI Builders has been sponsoring two big events: first Big Data World, where I was a speaker, and then the Gartner Analytics Summit, where my colleague Anja was a speaker. I would like to share my thoughts and key takeaways from this inspiring trip.
So first, Big Data World, a huge conference that was co-located with four other conferences: Smart IoT, Cloud Security Expo, Cloud Expo and Data Centre World.
As you can probably imagine, the buzzwords flew high here. Big Data vendors of all shapes and sizes had booths, and it was amazing both to hear about and to see demonstrations of how cloud-based, unstructured data can be utilized. There is a lot of upcoming smart software, and there are techniques that will help us utilize our data in new and exciting ways.
Last night I attended an after-work meetup on the topic "Clash Of The Titans". Microsoft, IBM, SAP and Oracle were presenting their BI and analytics solutions, both what they can offer today and what their future releases will look like.
Thinking back to the year 2000, when I started my first job as a data warehouse developer, we programmed SAS code without any IntelliSense, on a dark blue background with a white font.
After writing 500 lines of code we said a small prayer and pressed F8. Usually we got lucky; other times we didn't, and we had to spend the rest of the day finding that small typo, or looking for the breach in logic that didn't give us the result we wanted.
The buzzwords and the endless possibilities of the cloud seem to be on everybody's minds these days. Did we forget our core business?
At the breakfast table at the TDWI conference in February 2016, I met two gentlemen and we started talking about Big Data, cloud BI and the other buzzwords. They told me a story from their company.
"One day my boss came into my office and asked, 'How are we on big data?' I had to ask him back, 'What do you mean?' My boss asked again, 'How are we on big data? Do we have Hadoop?' And I said, 'No?' 'Well, we need that,' my boss replied, and walked out the door. So now we have a Hadoop cluster in the cloud with little or no data, and we really don't know why we need it, but I'm sure my boss got his bonus!"