
OWC to Bite the Dust

I just learned from David Gainer’s post that Microsoft has decided to retire Office Web Components (OWC) in favor of Excel Services. This is definitely sad news for me. While I agree that Excel Services could fulfill some of the integration scenarios mentioned in the comments to that post, I can think of at least two scenarios where this won’t be the case:



  1. You need a cost-effective and easy way to provide OLAP features in your applications. The Office Server pricing model has not been announced yet, but it could prove intimidating for small companies or application vendors. Not to mention that SharePoint requires Windows Server.

  2. You need to embed an OLAP browser in your Windows Forms application. Currently, I don’t know of a way to embed an Excel spreadsheet on a Windows Form (VSTO, Excel Services, VBA, etc. don’t support this). Not to mention that if browsing cubes is the only thing you need, installing and maintaining the entire Office Server stack may be a difficult proposition.

IMO, it would have been great if OWC had been replaced with web and Windows Forms OLAP viewers similar to the VS.NET report viewers for SSRS 2005. This would have given developers a lightweight alternative for UDM application integration. Now we have to wait for third-party vendors to fill the void. Sigh…

UDM Data Islands

As I mentioned in my TechEd Memoirs post, Microsoft is scaling back from the “super” cube approach in favor of more granular data “islands”. Since this took me by surprise, I approached the SSAS team for clarification. Here are the answers I managed to get:

1. What performance benefits could be realized by splitting the UDM into data islands?

At this point, we don’t have detailed performance numbers we can share. But to give an indication – a cube that had around 15 measure groups and a very large number of attributes (over 100 cube dimensions, many with more than 100 attributes) showed a performance improvement of about sixfold when split into separate cubes (one per measure group), for queries that referenced only a single measure group. The main benefit came from the reduced cube space – the individual cubes only needed to include the dimensions related to the measure group, and therefore paid no overhead for unrelated dimensions. There was also some benefit from the MDX script being simpler for each of the individual cubes.

2. What guidelines should be followed when partitioning the UDM?

The general principle is to have multiple cubes, perhaps even one per fact table. Then have other cubes that use linked measure groups to combine them into the complete view. Use the small cubes when they meet the needs, and query the full cube when measures from more than one measure group are required.

Some general guidelines are:

– If the dimensionality of all the measure groups is (much) the same, splitting up will yield limited benefit (though there is some benefit from the simpler MDX script per cube).

– If queries very commonly reference measures from more than one measure group, there is no value in splitting those measure groups into separate cubes.

– The performance benefit must be balanced against the negative impact on the end-user view. If a question using measures from just one measure group frequently becomes a question where a measure from another measure group is added, forcing the user to switch cubes in the middle would be tiresome.

– A single large cube is easier to manage, as there is no need to duplicate scripts.

For calculations that span measure groups, e.g. where many measure groups have calculations using an exchange rate, you certainly would not want to use LookupCube. You could either keep all those measure groups in the same cube, or split them up and again use linked measure groups to make the exchange rates available in multiple cubes.
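To put the LookupCube advice in context, here is a minimal MDX sketch of the two approaches. All cube, measure, and dimension names below are hypothetical; the assumption is a Sales cube whose exchange-rate measure group (native or linked in) exposes an [Average Rate] measure.

    -- Discouraged: LookupCube evaluates an expression in another cube's
    -- context on a cell-by-cell basis, which is slow
    WITH MEMBER [Measures].[Sales Amount USD] AS
       [Measures].[Sales Amount] *
       LookupCube("Currency Rates",
          "([Measures].[Average Rate], [Currency].[Currency].[US Dollar])")
    SELECT [Measures].[Sales Amount USD] ON COLUMNS,
       [Date].[Calendar Year].MEMBERS ON ROWS
    FROM [Sales]

    -- Preferred: with the exchange-rate measure group linked into the Sales
    -- cube, the same calculation stays within a single cube space
    WITH MEMBER [Measures].[Sales Amount USD] AS
       [Measures].[Sales Amount] *
       ([Measures].[Average Rate], [Currency].[Currency].[US Dollar])
    SELECT [Measures].[Sales Amount USD] ON COLUMNS,
       [Date].[Calendar Year].MEMBERS ON ROWS
    FROM [Sales]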


 


3. By a “data island”, do you mean leveraging the linked dimension and measure group features? If so, are there any known limitations and restrictions?

Yes, this uses linked measure groups, as described above. The dimensions are simply shared and need not be linked. As far as limitations go, there is the overhead of having to define calculations and so on in the cube that combines the smaller cubes. Also, Report Builder does not handle such databases that well, as the linked measure groups appear as separate entities in the report model (there is nothing saying that Sales in one cube is exactly the same as Sales in another).


4. When can we expect a UDM performance whitepaper?


One is being worked on.

MDX Performance Tips Article

I found a short but very useful MDX performance article on sqlserveranalysisservices.com. I know I am guilty of breaking the first rule (Don’t Filter on Member Property Values) more than once. Maintained by Richard Tkachuk (Lead Program Manager on the SSAS team), the site offers other interesting tips and insights as well.
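To illustrate that first rule as I understand it, here is the anti-pattern next to the attribute-hierarchy alternative. The queries below are my own sketch against the Adventure Works sample cube, not taken from the article.

    -- Slow: Filter evaluates the member property cell by cell
    SELECT [Measures].[Internet Sales Amount] ON COLUMNS,
       FILTER([Product].[Product].[Product].MEMBERS,
          [Product].[Product].CurrentMember.Properties("Color") = "Red")
       ON ROWS
    FROM [Adventure Works]

    -- Faster: reference the Color attribute hierarchy directly, letting the
    -- server resolve the set without scanning member properties
    SELECT [Measures].[Internet Sales Amount] ON COLUMNS,
       EXISTS([Product].[Product].[Product].MEMBERS, [Product].[Color].&[Red])
       ON ROWS
    FROM [Adventure Works]

The design lesson is the one the article draws: if users routinely filter on a property, expose it as an attribute hierarchy.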

Chris Webb’s Review

Chris Webb, whose latest book “MDX Solutions” is a must-read for everyone who’s serious about Analysis Services 2005 and MDX, has written a review of my book “Applied Microsoft Analysis Services 2005”. Thank you, Chris!

Cumulative hotfix package (build 2153) for SQL Server 2005 is available

Microsoft has released a post-SP1 hotfix that addresses various bugs and issues in SQL Server 2005 and its add-on services.

Where Is My Cartridge?

A little-known fact about the SSAS data architecture is that it uses “cartridges” to communicate with the data source. In brief, a cartridge is an XSL stylesheet that defines the capabilities of a data source, as well as the rules for optimizing the SQL statements for relational querying and writing. SSAS 2005 ships with a set of cartridges for Jet, SQL 70, SQL 2000, Oracle, Teradata, and DB2, which can be found in the \Program Files\Microsoft SQL Server\MSSQL.2\OLAP\bin\Cartridges folder. Vendors can plug in cartridges for other data sources if needed (a server restart is required).


One gotcha is when the UDM uses multiple data sources in a single data source view (DSV). This scenario requires that the primary data source be SQL Server, because behind the scenes the server uses the SQL Server-specific OPENROWSET statement to extract data from the secondary data source(s). The problem with this approach is that it effectively bypasses the installed cartridge for the non-SQL Server data source. As a result, processing queries that execute just fine when the DSV uses only that data source fail to execute in a multi-data-source DSV.


There are at least three workarounds for this predicament. First, you can replace each table in the DSV with a named query that uses the right native syntax. Second, you can link the other server to your SQL Server and wrap the linked-server tables in SQL views. A third solution is to split the UDM per data source – a SQL Server UDM and another UDM for the second data source. Then, you can link the dimensions and measure groups from one UDM to the other. As you have probably guessed, all of the above approaches may present maintenance and operational challenges. It would be great if a service pack or a future release solved this issue and honored the cartridges with heterogeneous queries.

MDX Solutions Book on the Horizon

Chris Webb tells us that the MDX Solutions book, which he co-wrote with other MDX gurus, is soon to be released. Based on the resources available on the publisher’s website, the book looks like a very promising read and I am looking forward to getting my hands on it. Unfortunately, during my Applied Microsoft Analysis Services 2005 book project, I didn’t have the time to fully research the vast MDX changes in SSAS 2005 and devote to them the space they deserve. This is where MDX Solutions comes in. I enjoyed the previous edition by George Spofford. I am sure that I will enjoy the sequel as well.

Join me in the "hub"

I’ve just learned that I’ve been selected to present at TechEd 2006 US in Boston! I’ve had the honor of presenting Reporting Services at TechEd for three consecutive years. This time, I will change things a bit: instead of focusing on the development area, I will talk about integrating Reporting Services and Analysis Services. As of now, the main topics I am planning to cover are:


1. SSAS report actions
2. MDX Query Designer
3. Server aggregates
4. Building data mining reports
5. Ad-hoc reporting
6. CLR integration


See you in the “hub”!

Nick Barclay’s Review

Nick Barclay from Australia has good things to say about my book “Applied Microsoft Analysis Services 2005”. No, I didn’t sweat blood, but I have to admit that writing this one wasn’t easy! It took me almost twice as long to write as my previous book, “Microsoft Reporting Services in Action”. For the most part, that was because Analysis Services is a focal point of many technologies, including OLAP, data mining, data integration, reporting, performance management, etc. Not to mention that, as Nick put it, it was like trying to hit a moving target, because the product was constantly evolving on the way to RTM.


The other main reason it took so long is that I just can’t write short. I am a firm believer that knowing a product means knowing its limitations. I usually put myself in my readers’ shoes and try to address all the possible questions they may ask while reading the book. I am committed to pointing out not only the tool’s strengths but also its weaknesses (nothing is perfect, right?). Of course, this takes time – lots and lots of time…

Applied Microsoft Analysis Services 2005 – Bestseller on Amazon!

My book “Applied Microsoft Analysis Services 2005” is a bestseller on Amazon! It is currently ranked No. 94 in Computers and Internet (as of Jan 23). The book is now in its second printing (it has been available on Amazon since 12/15/2005).


Thank you, thank you, thank you!