Posts

Calculated Member as Regular Measure

One of my favorite modeling tricks when I need a UDM calculated measure is to implement it as a regular measure by creating a named calculation column in the DSV. I set the expression of the named calculation to NULL to minimize storage. This approach has a couple of advantages:

  1. It allows you to set a default aggregation function, e.g. SUM(). In comparison, a calculated member (created with CREATE MEMBER) cannot have an aggregation function. This could be useful if you need to perform a calculation only at the leaf members of a dimension and then sum the results up (see the sketch after this list).
  2. You can scope across several measures. For some obscure reason, the SCOPE operator doesn’t support calculated members. For example, the following statement will trigger a “The Measures dimension is used multiple times” (or similar) error on deploy if Member1 and Member2 are calculated members:

    SCOPE
    (
        [Date].[Date].[Date].Members,
        {
            [Measures].Member1,
            [Measures].Member2
        }
    );

    but it will work if they are regular measures. If you want to keep them as calculated members, you can work around the limitation by changing the statement to:

    SCOPE
    (
        [Date].[Date].[Date].Members
    );
        {
            [Measures].Member1,
            [Measures].Member2
        } = <some assignment>;
    END SCOPE;

    but this may have side effects, e.g. you can’t apply format settings in this scope because everything that intersects [Date].[Date].[Date].Members will be formatted that way as well.
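
Going back to the first point, here is a minimal sketch of the leaf-level pattern. The measure and column names ([Avg Unit Price], [Sales Amount], [Order Quantity]) are hypothetical; the assumption is that [Avg Unit Price] is the regular measure bound to the NULL named calculation, with its AggregateFunction left as Sum, so the ratio is evaluated at the date leaves and the results roll up from there:

SCOPE
(
    [Date].[Date].[Date].Members,
    [Measures].[Avg Unit Price]
);
    -- hypothetical leaf-level calculation; the Sum aggregation rolls the results up
    THIS = [Measures].[Sales Amount] / [Measures].[Order Quantity];
END SCOPE;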

One potential gotcha that bit me when implementing a calculated member as a regular measure is that the DSV defaults the named calculation to the Integer data type (because its expression is NULL), and the Integer type propagates to the regular measure. As a result, if your calculated expression is a ratio, the results may be rounded in some clients (e.g. Report Builder) although the actual query returns the correct results. To correct this, open the DSV source and change the data type of the named calculation to xs:decimal (you can’t change it in design mode).

SQL Server 2005 Service Pack 2 is Born

As an update to my previous post, SQL Server 2005 SP2 is now officially available. The SP2 build is 9.00.3042. An SP2 landing page is also available, with links to the SP2 release, KB articles, and marketing information about the benefits of SP2.


As a personal contributor to Service Pack 2 (mainly in the areas of Reporting Services SharePoint integration and Analysis Services), I hope you enjoy it!

Feedback on Analysis Services Performance Guide

Now that I’ve read the Analysis Services Performance Guide (or shall we call it a mini-book?), which I announced in my previous blog, I can say it is a great read. I’d recommend it to anyone who would like to get more insight not only into performance tuning of UDM but also into the inner workings of the server.

Some caveats… The guide doesn’t answer the perennial and fundamental question about partitioning a large UDM. As I explained in my UDM Data Islands blog, Microsoft scaled down from the initial “super cube” approach and now advocates splitting a large cube into smaller subcubes (which Jamie McClellan referred to as “data islands”) for performance reasons. What I was hoping to find was some performance guidelines and metrics about the data volumes at which this split should occur. Given how important this is from a performance standpoint, I also fail to understand why there is no reference whatsoever to linked measure groups and dimensions.

Finally, the guide is a bit light from a capacity planning and load-balancing standpoint. Hopefully, there will soon be a refresh of the “Creating Large-Scale, Highly Available OLAP Sites: A Step-by-Step Guide” whitepaper.

Analysis Services 2005 Performance Guide

The highly anticipated Analysis Services 2005 performance whitepaper is finally here. Glancing at it, I found a few things interesting:

  • This is a colossal work spanning some 120 pages (no wonder it took so long [:)]).

  • The guide was written by the top architects on the SSAS team.

  • It specifically references SQL Server 2005 SP2 probably because SP2 brings many performance optimizations to SSAS.

  • The guide mentions a new aggregation utility (Appendix C) which you can use to manually create aggregation designs. 

Something that’s sure to keep me busy when the winter storm hits Atlanta tomorrow…

Kudos to Report Builder as UDM Client

Its UDM support limitations notwithstanding (a partial list here), Report Builder should definitely be on your list if your requirements call for ad hoc reporting from an SSAS 2005 cube. Here are some Report Builder features that I particularly like:

  1. Ability to pre-filter the data before the report is run. In comparison, Excel takes the optimistic approach of loading all dimension members before allowing you to set a filter. This can surely send Excel into la-la land if your cube has large dimensions (a hundred thousand members or more).

  2. Support for calculated columns. For some obscure reason, Excel doesn’t have a UI for creating calculated members (you can do so programmatically, however).

  3. Standard report look and feel. In comparison, Excel limits the user to the PivotTable fixed layout (assuming that the user doesn’t convert the report to formulas).

  4. My Reports, which allows the business user to save her reports in her own area of the report catalog if she doesn’t have rights to write to other folders.

  5. Nice filter support. This may not be so obvious, but each SSAS attribute hierarchy supports an InstanceSelection property which Report Builder honors. For example, the attached image shows what happens if you set InstanceSelection to List or FilterList (there is no UI difference between the two). Another filter UI option is Dropdown.

Of course, the Report Builder-UDM integration will improve in future releases (believe me, I am banging hard on the SSRS-SSAS integration door) to make it an even more compelling choice for UDM reporting.

Linked Attribute Hierarchies

In UDM, the cube space is defined by attribute hierarchies. Dimensions are just logical containers of attribute hierarchies. It would be great if the next release of SQL Server (Katmai) could expand further on the attribute nature of UDM and solve some nagging issues that modelers currently face.

Let’s consider an example. Say you have Geography and Customer dimension tables (see the AdventureWorksDW database). It is logical to expect that end users may want to browse data by a Geography-Customer hierarchy. As common as this requirement is, modelers today need to make some tradeoffs to meet it.

To start with, you could embrace the star schema and build two UDM dimensions (Geography and Customer) on top of the corresponding dimension tables. Then, the end users can drop these two dimensions side by side in their favorite OLAP browser to slice the data by Geography and Customer. Behind the scenes, the server will cross-join the two dimensions. Although cross-joining dimensions may meet the business need, there are a couple of well-known issues with this approach. First, you won’t be able to easily define useful hierarchy-related calculations, such as ratios of children to parent, because there are no user-defined hierarchies. Second, cross-joining large dimensions may impact performance.
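
As an illustration, here is a minimal sketch against the sample Adventure Works cube (the dimension, attribute, and measure names follow that sample and are assumptions on my part); placing the two dimensions on the same axis makes the server cross-join their attribute hierarchies:

SELECT
    {[Measures].[Internet Sales Amount]} ON COLUMNS,
    -- cross-join of the two attribute hierarchies
    NON EMPTY
        [Geography].[City].[City].Members *
        [Customer].[Customer].[Customer].Members ON ROWS
FROM [Adventure Works]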

Alternatively, you could decide to create a user-defined hierarchy (aka multi-level hierarchy) that spans the Geography and Customer attributes. This is where things get trickier. Assuming a star schema, both dimension tables would join the fact table. To build a Customer dimension spanning both tables, you would need to bridge them via the fact table – a definite no-no. The other solution is to duplicate the Geography columns in the Customer dimension table, but this leads to duplication of database and UDM storage and complicates the ETL processes.

At this point, ditching the star schema and snowflaking the Geography table off the Customer table starts to look appealing. Now you could build the Customer dimension on top of the Geography and Customer dimension tables – the approach that the sample AdventureWorks UDM takes. The first tradeoff is that you may need to fight an uphill battle with fans of “classic” dimensional modeling who swear by star schemas. Second, this approach still results in duplication of UDM storage. That’s because, when the Geography and Customer dimensions are processed, the server will still build all dimension attribute hierarchies irrespective of the fact that some of them are identical. For example, if you have a City attribute (from the Geography table) in the Customer dimension, the server will build two City attribute hierarchies – one when building the Geography dimension and another when building the Customer dimension.

It would be nice if a future SSAS version supported linking attributes from one dimension to another. With this feature, the database schema (star or snowflake) could become irrelevant. The City attribute from the Geography dimension could simply be linked to the Customer dimension. Of course, the cube schema would still be constrained by the underlying database schema and table relationships. However, cross-dimension attribute relationships would certainly enable more flexible scenarios.

The Many-to-many Revolution

One of the major UDM enhancements that transcends the boundaries of traditional OLAP is flexible dimension relationships, including many-to-many, referenced, and fact relationships. Marco Russo, who helped me tremendously with my Analysis Services book, has just published a very comprehensive whitepaper (shall we call it a mini-book?) about many-to-many dimension relationships. I had the pleasure of being one of the reviewers.

For those who are not familiar with his work, Marco is one of the few people on this planet who have had a deep understanding of, and real-world experience with, Analysis Services since its early stages. Thus, this whitepaper is a valuable resource that discusses the practical implications of UDM many-to-many relationships. The real-life scenarios presented in the whitepaper unlock the mysteries of this revolutionary OLAP concept.

Don't forget to check Marco's blog and his SQLBI.EU website for more UDM insights. Great work, Marco, and looking forward to a book from you! [:O]

Ampersands Gone Wild

Thanks to Geoff’s feedback on the discussion list, today I was able to demystify one of the SSRS-SSAS integration “gotchas” that has been pestering me for quite some time.

Sometimes a report may need the Jump to URL navigation feature to open a parameterized OLAP report. Since UDM member unique names contain & (to designate the key), I had been unable to find a way to construct a JavaScript link that correctly escapes & in the report parameters, e.g.:

="javascript:void(window.open('http://localhost/ReportServer?/OLAP/Daily Product Sales&DateTimeIndex=[Date].[Time Index].&[2003]&SalesTerritoryGroup=[Sales Territory].[Group].&[North America]&rs:Command=Render’))"

Here, the Daily Product Sales report takes two parameters (DateTimeIndex and SalesTerritoryGroup). As I mentioned in Chapter 8 of my book, even if you use the ampersand escape code %26 (or the JavaScript escape function), the browser will “helpfully” unescape the value back to & and the Report Server will choke. The trick is to use %2526 instead of just %26, as the next example shows:

="javascript:void(window.open('http://localhost/ReportServer?/OLAP/Daily Product Sales&DateTimeIndex=[Date].[Time Index].&2526[2003]&SalesTerritoryGroup=[Sales Territory].[Group].&2526[North America]&rs:Command=Render’))"

%2526 is needed in a JavaScript call with Jump to URL because the URL gets processed by the browser twice: %2526 becomes %26, which in turn becomes &.

Dundas Chart for OLAP Services

If you are on the lookout for a web-based smart chart that can browse SSAS 2005 cubes, look no further. Enter Dundas Chart for OLAP Services! Despite the name (what’s Dundas anyway?), I really fell in love with this control after playing with its demos for a while. The chart can connect to both server and local cubes. The attached image shows the Dundas Chart connected to the Adventure Works DW cube.

The beauty of the Dundas Chart is that it’s more than a chart. It is a web-based OLAP browser. And it’s AJAX-enabled, so the page doesn’t re-post as a result of user actions! From an end-user perspective, authoring a chart is a matter of dragging and dropping dimensions and measures, the same experience as creating an OLAP-based pivot or chart report in Excel.

Given the void left by OWC and the lack of Microsoft OLAP browser controls, the Dundas Chart for OLAP Services is definitely something to consider when planning for multi-dimensional web reports.

UDM Many-to-many Relations and "AND" queries

Recently I got an interesting question about querying UDM many-to-many relationships. For example, in my book I demonstrated how you can implement a UDM model (the Bank cube) that has a many-to-many relationship between bank customers and their accounts. That’s because a customer could have more than one account and an account can belong to more than one customer (a joint account).

But what if you want to find only the joint accounts owned by two specific customers? The following query returns the accounts owned by both Bob and Alice:

SELECT
    NON EMPTY {[Measures].[Balance]} ON COLUMNS,
    Intersect
    (
        Exists([Account].[Account Number].[Account Number],
               [Customer].[Full Name].&[Bob], "Customer Account"),
        Exists([Account].[Account Number].[Account Number],
               [Customer].[Full Name].&[Alice], "Customer Account")
    ) ON ROWS
FROM Bank

The trick is to evaluate the sets over the hybrid “Customer Account” measure group, which is actually based on a dimension table but plays the role of a measure group. Each Exists call returns the accounts that the given customer owns, and Intersect returns the accounts common to both sets – that is, the joint accounts.