Business Intelligence Semantic Model – The Good, The Bad, and the Ugly

UPDATE 11/14/2010

This blog probably will be the most updated blog I’ve ever written. I have to admit that my initial reaction to BISM was negative. For the most part, this was a result of my disappointment that Microsoft switched focus from UDM to BISM in Denali and my limited knowledge of the BISM vision. After SQL PASS, I exchanged plenty of e-mails and Microsoft was patient enough to address them and disclosed more details. Their answers helped me to “get it” and see the BISM big picture through more optimistic lenses.

 

UPDATE 05/21/2011

Having heard the feedback from the community, Microsoft announced at TechEd 2011 that Crescent will support OLAP cubes as data sources. This warrants removing the Ugly part from this blog.

 

After the WOW announcement at SQL PASS about PowerPivot going corporate under a new name, Business Intelligence Semantic Model (BISM), there were a lot of questions from the SSAS community. Your humble correspondent did his own share of nagging. For some obscure reason, this time Microsoft decided to keep MVPs and community leaders in the dark until PASS, so I was as unprepared as the rest of the community about what is to come. Prior to PASS, the SSAS team told me that corporate BI would be a major focus in Denali, which I interpreted as enhancements to UDM, only to find out that the entire buzz is about BISM…what a thunder from a clear sky. To its credit, the SSAS team was quick in its attempt to cover the collateral damage as we can see in the posts below from Amir Netz and T.K. Anand.

http://cwebbbi.wordpress.com/2010/11/11/pass-summit-day-2/#comment-1498

http://blogs.technet.com/b/dataplatforminsider/archive/2010/11/12/analysis-services-roadmap-for-sql-server-denali-and-beyond.aspx

I decided to put down the gist about what I’ve learned about BISM so I have a ready answer for questions that I am sure I will be asked over and over. Let’s start with the WHY.

Why BISM?

While SSAS is now an undisputed leader in the OLAP space and the BI platform of choice, Microsoft is facing competitive pressures. As you know, new vendors and technologies are popping up like daisies every day in the fast-changing BI landscape attacking on all fronts – QlikView, Tableau, Business Object Universe, to name a few, and Microsoft cannot rest on its laurels. The main selling point of these companies is that OLAP is too complex. I’ve seen their marketing campaigns in action and I know how persuasive they could be. One interesting thing I’ve noticed though is that the sleek demos I’ve seen all used star schemas, where data is neatly organized and stored in a set of dimension and fact tables. Creating this schema and populating it usually take about 60-80% or the development effort for implementing a BI solution.

So, if the star schema is not obsolete, what’s so appealing then in the offerings of these vendors? The notion that the OLAP layer is too complex and you don’t need it. This is a clever strategy (and one of the laws of power) – If you can’t have it (OLAP market share) than despise it J Instead, these companies build their analytical layer as a thin wrapper directly on top of the star schema, which is exactly what PowerPivot does. This abstracts IT from knowing all these intimidating OLAP concepts, such as cubes, dimensions, measures, and the most complex of them all – MDX. This is a perceived simplification though because it’s unlikely to meet more demanding requirements. For example, I wouldn’t have been successful implementing BI financial solutions if I didn’t have UDM features, such as scope assignments, parent-child dimensions, and many-to-many relationships.

While I don’t personally believe that UDM (the SSAS implementation of the OLAP layer) is too complex, I do think that it carries a lot of baggage partially due to its deep roots in traditional OLAP and evolution over years. Let’s name a few:

  1. Cubes – The cube as a core OLAP concept that became somewhat outdated after moving to the UDM which is an attribute-based model. Ever wondered if you should go for one cube or multiple cubes? This is one of the main dilemmas when starting a new project.
  2. Detail-level reporting – This is where UDM over-promised but under-delivered. For example, UDM doesn’t support text-based measure. Besides actions, the clumsy workarounds are action or degenerate dimensions, with the latter being what the “classic” OLAP recommends.
  3. Storage complexities – the need to process the cube, partitioning, aggregations, etc. Moving to BISM, you still need to load the data into VertiPaq (unless you do real-time). So there is still “processing” involved but there is no need for aggregations or “index building” anymore. Partitions are still there – but they are needed for data management purposes (incremental updates, rolling off data etc) and not for performance reasons.
  4. Data latency – the UDM proactive caching attempts to address it but it’s far from ideal.

Again, I don’t list complexity because I don’t think UDM is very complex. Or if it is, that’s because business requirements are and I sure don’t want BISM to have less features that will prevent us meeting such requirements. So, the SSAS team had a dilemma: should they continue enhancing UDM or go back to the drawing board. Microsoft realized that the UDM internal architecture and storage model have become increasingly complex over the years, thus barring major changes and growth. Instead of continue building on the old foundation, they’ve decided to start from scratch by going with the VertiPaq column-oriented store that debuted in PowerPivot. I have to give credit to the SSAS team for making such a bold and a gut-wrenching decision, which I am sure hasn’t been made lightly.

The Good

BISM is essentially a fresh start for Microsoft in the BI space. It will attempt to simplify OLAP and make BI even more accessible. As many of the BI professionals agree, based on what we know and seen in PowerPivot, BISM has a huge potential and will bring welcome enhancements, such as:

  1. Schema simplification – no need to define explicit cubes, dimensions, measures, and thus eliminate the perceived complexity of implementing an OLAP solution. In fact, expect the term “OLAP” to be heavily deemphasized in years to come.
  2. Blazing performance – this will probably void partitions and aggregations.
  3. Flexibility – There will be no distinction between measures and dimensions. Every attribute can be used for aggregating and slicing. This is one PowerPivot feature that I like the most.
  4. Detail-level reporting without hacks.
  5. DAX expression-based language that removes some of the complexity of MDX.
  6. Possibility of real-time data access – BISM running in VertiPaq mode will deliver the best performance and it is similar to MOLAP. However, data latency will also be improved if you work in real-time (like ROLAP) against a SQL database that has column store indexes and you don’t query massive data volumes. It should be fast, but still slower than working directly against the BISM VertiPaq.

The Bad

In Denali, BISM will not be as feature-rich as UDM and it probably won’t be for years to come. The lack of more advanced features (we don’t have a feature comparison list yet), will probably make organizations and consultants favor UDM in the interim, which is still the “bread and butter” for MS OLAP. I personally won’t go for a new model knowing that its limitations can bite me in a long run or a model that puts me back a decade ago no matter how simple and fast it is. As you know, all projects start simple until a requirement that changes everything.

So, why didn’t Microsoft left BISM in the self-service BI area to marinate for a while as PowerPivot?

  1. BISM will let Microsoft levels the playing field against the various enterprise BI semantic models
  2. Remove Excel hosting limitations, such the 2 GB size limitation an Excel workbook has. In reality, 2 GB translates to many more gigabytes of uncompressed data but still not enough for corporate data volumes.
  3. Microsoft can now position BISM as a BI platform for both reporting and analytics without abandonment of relational concepts in favor of “weird” OLAP terminology. This is something that mainstream IT will find very attractive.
  4. BISM will provide the continuum of Self-service BI -Corporate BI on a single platform. You can start with self-service BI with PowerPivot and migrate it to a corporate solution.

Unfortunately, since BISM will not get all UDM features at least in Denali, in the interim we will have to make a choice between UDM and BISM, which essentially will be a compromise between features and ease of use. In years, as BISM catches up with UDM, I’d expect the choice to become easier. Meanwhile, BISM will not be as advanced as the UDM and for more advanced requirements UDM is your only choice.

That said, even in its first release, you may find BISM much more advanced and powerful than competing technologies, such as BO Universe and similar metadata storage models. So, when evaluating the BISM capabilities, it is important to pay attention to your reference point. The UDM is a very high bar. However, if you compare it against Universe, Qlikview, etc., you will probably find BISM superior.

The Ugly

Crescent (the new SSRS ad-hoc reporting tool) will support BISM only. Frankly, I don’t know what was the thinking here especially given that UDM is still the “bread and butter” of OLAP (see the links above). The message that everyone would get is that UDM has fallen out of favor which is apparently not the case. I strongly encourage Microsoft to support UDM in Denali even if it won’t make it in the box (no a big surprise there, Report Builder 2.0 has done it and hell didn’t freeze over). I think Crescent has a lot of promise and since we don’t have a webi browser, it is long due on the BI wish list. There are probably good technical reasons for supporting BISM only but if Excel can do MDX against PowerPivot, so should Crescent. This will also reinforce the message that UDM is still the premium model for years to come while BISM is in the works.

So?

BISM has a bright future and will be a winner if it delivers on its promise but doesn’t throw the baby with the bathwater:

  1. Simplifies OLAP
  2. Preserves UDM flexibility and features we have today, especially the ones that came with SSAS 2005, such as flexible dimension relationships and scope assignments. 
  3. Supports both DAX and MDX as query and expression languages to preserve investment developers and vendors have made throughout all these years.

Do I feel that my consulting business will be negatively affected by BISM should it become super user-friendly so even my mom can use it? I don’t think so. Here is where the C++/C# analogy could be helpful. Those C++ developers who took the time to learn C#, which for them was a walk in the park, found that it opened new opportunities. And again, no matter how simple the tool, the requirements are what makes things complex and I don’t think BISM will read minds any time soon. Sure, sometimes an old dog must learn new tricks but that’s why we love it …

May you live in interesting times!
A Chinese proverb