For the last couple of years the buzz has been all about “Self-Service” BI. Almost all BI vendors feel the need to tell the world that their tool is a perfect match for “Self-Service”. Finally, business users no longer need IT to analyze data, create reports, and so on. Nowadays this even comes with the possibility to explore all data available in data lakes (to use another buzzword). In this post I will explain why I do not agree with the promise of “Self-Service” that vendors and SIs try to sell as a concept to their customers.
When I started in BI in 1999, tool vendors and SIs, myself included, were fighting against users who created their own reports and analyzed their data with MS Excel and MS Access. For this they asked for direct ODBC access, or rather “a connection”, to the different source systems, worked their way through the data and tried to integrate it across different spreadsheets with lookups and such. Apart from being time-consuming and labor-intensive, it was also error-prone. But it was also very agile and direct for the users themselves. Of course they still needed help combining and understanding the data, and sometimes they did indeed ask for that support.
The major reason for the fight against this type of “Self-Service BI” was that it was considered dangerous. Users could change or manipulate the data or create calculations without any governance, on purpose or just out of ignorance. The answer from vendors and IT departments was: use proper BI tools like Cognos, BO, MicroStrategy, etc. Tool vendors and SIs, myself included, embraced the possibility of delivering reports as PDF to restrict the ‘misuse’ of data.
We tried to create the proper reports and dashboards for all users and included as many filters as requested to make the users happy. In the end users still asked for the “export to XLS” button, and for filtered reports containing lots of data at the lowest possible level of grain with a maximum of X rows. Of course this was their replacement for the direct ODBC access to the source systems, which they no longer had.
For me this already changed my opinion on restricting users in accessing (their own) data. I would rather give them freedom on the governed data in the data marts than let them access the source systems again. I like to think this was also in the users’ interest: the data marts made more sense than the source systems.
The need of users to control their data and have unlimited access was, and always will be, big. This was also heard by new and upcoming tool vendors like Qlik and, later on, Tableau and others. They claimed to offer easy-to-use tools that promise full control by the user (according to Qlik in the early days, you would not need a data warehouse anymore!). For me it led to a panel discussion at an event in Amsterdam, opposite some tool vendors, where I claimed that I don’t mind which tool users use to access their data. As far as I am concerned this could even be Excel! Of course all the other people on the panel were not amused and even offended. How could I suggest letting users use the darkest of all tools! Having no control over the presented data is the worst thing. The only thing is that they did not listen to what I suggested: have a governed and trusted set of data available which users can freely access and explore without any restriction on tooling. Of course, and that is governance as well, when there is a meeting or presentation the data is presented in a formal way with a standard tool that the whole organization uses. This prevents endless discussions about calculations and definitions. This is the architecture I want to create for my users.
So for me there is no such thing as true and complete Self-Service BI for the entire organization. There is a limited number of users capable of doing it all on their own while still getting the right data and information at the right time for the actions to be taken. They are, however, a minority, if they exist in an organization at all.
In my opinion there will always be a governed set of data (let’s call these the data marts, virtual or physical, in any modeling pattern or none) which users can freely access to explore and analyze. Next to that there is a fixed set of reports, dashboards, etc. which have a formal status in the organization with agreed definitions and use (and which can be based on the same datasets). This is what I call Managed-Service: the data is already combined, integrated and curated, and ready for the users to explore. Which tool they use for it, and whether they want to combine it with other data, is up to them to decide. However, in my experience the only true “Self-Service” tool is Excel, a tool which users feel free to use to explore data the way they want. Tableau, Qlik, Power BI and all the others have too big a technology component, which scares users. The exception, of course, is the power users: the people in a department who are happier working with more advanced tools and who are in fact the BI people within each department. These are also the people you want in your “Center of Excellence”, or who can at least help your team create reports and analyses to be used in the entire organization.
I would like to thank Ronald Damhof for his 4 Quadrant model (see: Keynote 22 mei 2014 – dwh automation – 4 Quadrant). For me it is a guide when I tell people my thoughts on “Self-Service”. People are free to use and explore data as long as they are aware that this is a non-governed (or less governed) area, Quadrant IV, even if the data itself comes directly out of the data warehouse, Quadrant I. Thanks also to Martijn Imrich for his post on the Data Lab (https://www.linkedin.com/pulse/what-data-lab-martijn-imrich), in which he suggests making the data fully available for exploration. He focuses on Quadrants I, III and IV of the Quadrant model, thus extending governed data with non-governed data.
In my Managed-Service thinking, the Data Lab consists of a governed set of data that data scientists can, and probably will, extend with extra data (external data, other data marts, other sources, or all of these) to explore and discover useful possibilities for their organizations. Part of this data is still managed, and the users in this group are highly skilled and trained to work with tools and data.