
Empty Affairs

If you have a cube with large dimensions, you may have come across the following error:

The expression contains a function that cannot operate on a set with more than 4,294,967,296 tuples.

The culprit is the NON EMPTY operator (or the NonEmpty function), as mentioned on Chris Webb's blog and Mosha Pasumansky's blog. It turns out that NonEmpty simply gives up when the number of tuples in the cross-joined set exceeds 2^32 (4,294,967,296). The problem is that all Microsoft SSAS clients (Excel, Reporting Services, Report Builder, etc.) are blissfully unaware of the trap to come and happily cross-join dimension members as you add more dimensions to the report. Take, for example, the following (simplified) query produced by the Report Designer when the report shows the customer name and its accounts:

SELECT NON EMPTY { [Measures].[Sales] } ON COLUMNS,
NON EMPTY { ([Customer].[Customer].[Customer].ALLMEMBERS *
[Account].[Customer Account Number].[Customer Account Number].ALLMEMBERS)} ON ROWS
FROM <some cube>
WHERE ([Date].[Date].&[20070712])

In my case, the server returned this error with about 150,000 customers and 200,000 customer accounts. I can think of two approaches to avoid it:

  1. Use Autoexists – Consider placing the frequently-queried large attribute hierarchies in the same dimension. Not only may this avoid the error, but it may also give you better performance, since the server knows how to cross-join attribute hierarchies within the same dimension efficiently. Let's say you can add the Customer attribute to the Account dimension so it exists in both the Customer and Account dimensions. This query executes successfully:

    SELECT NON EMPTY { [Measures].[Sales] } ON COLUMNS,
    NON EMPTY { ([Account].[Customer].[Customer].ALLMEMBERS *
    [Account].[Customer Account Number].[Customer Account Number].ALLMEMBERS)} ON ROWS
    FROM <some cube>
    WHERE ([Date].[Date].&[20070712])

  2. Reduce large sets with Exists() before NonEmpty()

    SELECT NON EMPTY { [Measures].[Sales] } ON COLUMNS,
    NON EMPTY { ([Customer].[Customer].[Customer].ALLMEMBERS *
    Exists([Account].[Customer Account Number].[Customer Account Number].ALLMEMBERS, [Date].[Date].CurrentMember, "<Measure Group Name>"))} ON ROWS
    FROM <some cube>
    WHERE ([Date].[Date].&[20070712])

In this case, the Exists function returns only the customer accounts that have data in the measure group for the date requested. The net result is that you pass a much smaller set of accounts to the NonEmpty function.

Analysis Services Query Performance Top 10 Best Practices

I came across this little gem, which I hadn't noticed until now: a nice summary of the ten things you should do to optimize UDM query performance. The best one is kept for last: scale out when you can no longer scale up. So true, if only I could convince customers to do so :)

Protect UDM with Dimension Data Security

SQL Magazine published the first part of my article Protect UDM with Dimension Data Security in its July issue. This article explains the fundamentals of dimension data security. I am working on making the article available for public access (as of now, it requires subscriber-level access). The article code can be downloaded from the publisher’s website and from my website.

I am currently writing the second part (tentatively named Protect UDM with Dynamic Dimension Security), whose focus is dynamic dimension security. It will present two approaches for implementing dynamic dimension security harvested from a real-life project: a factless fact table and integration with an external security service.
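
To give you a rough idea of the factless fact table approach, the allowed set for the Customer attribute hierarchy in the role is an MDX expression that uses the UserName function to keep only the customers associated with the interactive user in a factless security measure group. The sketch below is illustrative only; the [Security User] dimension, its Login attribute, and the [Security Count] measure are hypothetical names, not the actual implementation from the article.

-- Allowed member set for the Customer attribute hierarchy (sketch only;
-- [Security User], [Login], and [Security Count] are hypothetical names)
NonEmpty
(
    [Customer].[Customer].[Customer].Members,
    (
        StrToMember("[Security User].[Login].&[" + UserName + "]"),
        [Measures].[Security Count]
    )
)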

Trust Relationships

If you use VS.NET to deploy your SSAS project from one domain to another, you may get the following obscure error message on deploy.


The trust relationship between the primary domain and the trusted domain failed.


This may happen even if your workstation and the deployment server are on the same domain. The most likely reason for this error is that you have added Windows groups or users from the old domain as members of SSAS role definitions. To fix the error, clear the role membership list.

TechEd 2007 US Memoirs

I am back from a one-week vacation in Clearwater, Florida, followed by TechEd 2007 US in Orlando. Overall, nothing earth-shattering in the BI arena from TechEd 2007. Product groups are between releases, so most of the content was SQL Server 2005-based, with a few sneak previews of some of the good stuff to come in Katmai.


My presentation attracted a fairly large crowd. Almost 500 folks joined my presentation "Applied Microsoft Reporting Services 2005", which was scheduled for the first breakout slot after the keynote on Monday. Thank you! I've made the source code and slides available for download on my website. Courtesy of Marco Russo, I've also uploaded a couple of photos to capture the moment of glory.


Now that some of the cool features coming up in the Katmai release of Reporting Services have been officially announced and I can open my mouth about them, here are the most important ones:



  1. Standalone report designer – both the VS.NET Report Designer and the Report Builder will share the same full-featured designer. Note that "standalone" doesn't mean "embeddable". While a post-Katmai release may bring us a designer that can be hosted in a custom application, this will probably not happen in Katmai. Sigh…

  2. Tablix region – The table and matrix regions unite and give birth to a brand new region called tablix. This versatile control can handle it all. By changing a few properties, the tablix can morph itself into a table, crosstab, or list report! The tablix region will enable reports that were not possible in the past or required hacks to get around the matrix limitations, such as stacked columns in a crosstab report, multiple subtotals, and multiple pivot columns. I wish I had this baby a year ago…

  3. RDL Object Model – The Katmai release will ship with a public RDL Object Model. This is great news for developers (myself included) who need to generate RDL programmatically.

  4. Acquisition of the Dundas RS controls – MS acquired the Dundas chart, gauge, and map controls for Reporting Services. This will surely spice up the report graphical experience and position MS to compete better with other report vendors.

  5. Acquisition of the SoftArtisans OfficeWriter technology – This will let business users embed reports in Microsoft Word and Excel documents.

  6. Many other enhancements – inline HTML support, a better designer experience, export to Word 2000-2007, etc.

I hope I haven't missed something important. The first Katmai CTP build to include some of the new RS enhancements is due in July.


Nothing too exciting on the SSAS side of things, as the main theme of the next release is "don't rock the boat". It will focus mainly on improving server performance, manageability, and robustness.

Companion for MS Analysis Server

SQL Minds has released Companion for MS Analysis Server, a product that helps with the health monitoring and performance tuning of Microsoft SSAS 2005. The product offers historical baselines, custom aggregations and their usage, metadata browsing, cube browsing, various reports on SSAS inner workings, trace information on the slowest SQL, DMX, MDX, and XMLA requests, processing information broken down by steps, capacity planning and forecasting, and more…

Cube Browser Woes

I had an interesting issue this week. I was working on a small test cube with financial data. The lowest grain in the measure group was the Account dimension, which represents a customer bank account. I had to implement rolling 12-month calculations on semi-additive measures (e.g. Avg Account Balance) by weighting the balances at the account level:

Rolling 12 Bal = ∑(Number of Account Active Days * Avg Account Balance) / ∑(Number of Days in the 12-month period)

 

Since I had to weight the account balances by the number of days the account was active, I decided to scope at the account level, e.g., SCOPE(Leaves(Account), Leaves(Date), ...). Also, I had to use the ParallelPeriod() MDX function to sum the weighted balances for the 12-month period. To my surprise, the small test cube took an enormous amount of time to initialize in the Cube Browser. A Profiler trace showed as many subcube events as there are accounts in the cube. It appeared that the server was initializing the expression for each account. This didn't make sense at all, considering that the server should only write the formula into the scoped cells, which should happen pretty fast even with large cubes and very granular scope assignments. After some digging, it turned out that the culprit wasn't the server but the Cube Browser :( For some obscure reason, the Cube Browser issues the following statement each time you reconnect, even with an empty report.

 

select filter([Measures].members, [Measures].currentmember.member_caption = [Measures].currentmember.member_caption) on columns from [<cube name>]

 

This statement was causing the server to evaluate the Account All member, which triggered the formula evaluation for each account. Oops! The moral of this story is to test your cube initialization time in SQL Server Management Studio. As for me, I decided to reduce the scope (always a good idea) to Leaves(Date) only by pro-rating the Active Days in the data source so I could use the days in the month. This makes the formula universal irrespective of the cube slice on rows and keeps the Cube Browser happy.
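
For reference, the reduced scope assignment ended up looking roughly like the sketch below. The hierarchy and measure names ([Date].[Calendar], Active Days, Avg Account Balance, and Rolling 12 Bal as a measure you can assign to) are simplified placeholders rather than the actual cube definitions.

-- Sketch only: the hierarchy and measure names are simplified placeholders
SCOPE (Leaves([Date]));

    [Measures].[Rolling 12 Bal] =
        -- weighted balances over a rolling 12-month window of dates ending with the current date
        Sum
        (
            ParallelPeriod([Date].[Calendar].[Month], 11, [Date].[Calendar].CurrentMember)
                : [Date].[Calendar].CurrentMember,
            [Measures].[Active Days] * [Measures].[Avg Account Balance]
        )
        /
        -- number of days in the same 12-month window
        Count
        (
            ParallelPeriod([Date].[Calendar].[Month], 11, [Date].[Calendar].CurrentMember)
                : [Date].[Calendar].CurrentMember
        );

END SCOPE;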

SSAS Performance Samples

Thanks to Russell Christopher's blog, I learned that Microsoft donated an Analysis Services Load Simulator (ASLoadSim) sample project, which lets you load test Analysis Services using the VS.NET 2005 test capabilities. While I am yet to try the code sample, I am very impressed by the whitepaper included in the sample, which was harvested from Project REAL.

Chris Webb also contributed to the SSAS performance area by donating his MDX Script Performance Analyzer sample, which I am yet to try as well.

Report Builder and Text-based Calculated Members

Although not a common practice, UDM supports defining text-based calculated members. For example, here is a useless but valid text-based calculated member:


Create Member CurrentCube.[Measures].[Hello] As "Hello";


Much to my surprise, a text-based member won't show up in Report Builder. That's because the Report Builder defaults the member's data type to Float, and the text is simply "lost" when the report is run. How does the Report Builder determine the field data type with UDM as a data source? With regular measures, it's simple: the Report Builder derives the field type from the DSV data type. For calculated members, however, it seems that the Report Builder always defaults the data type to Float.


While there may be other workarounds to resolve this issue, one way is to "correct" the Report Builder's invalid assumption about the member data type. Since one would typically auto-generate the Report Builder model when UDM is used as a data source, the ultimate hack is to change the model directly. To do so, go to the model properties in the Report Manager and click the Edit button to save the model as a file. Next, find the text-based member (attribute) by name and change its type from Float to String. Finally, back in the Report Manager, click the Update button to update the model from the file. As a result, the report will show the member text values correctly.

Calculated Member as Regular Measure

One of my favorite modeling tricks when I need a UDM calculated measure is implementing it as a regular measure by creating a named calculation column in the DSV. I set the expression of the named calculation to NULL to minimize storage. This approach has a couple of advantages:

  1. It allows you to set a default aggregation function, e.g. SUM(). In comparison, a calculated member (created with CREATE MEMBER) cannot have a default aggregation function. This could be useful if you need to perform a calculation only at the leaf members of a dimension and then sum the results up (see the sketch at the end of this post).
  2. You can scope across several measures. For some obscure reason, the SCOPE operator doesn't support calculated members. For example, the following statement will trigger a "The Measures dimension is used multiple times" (or similar) error on deploy if Member1 and Member2 are calculated members:

    SCOPE
    (
        [Date].[Date].[Date].Members,
        {
            [Measures].Member1,
            [Measures].Member2
        }
    );

     

    but it will work if they are regular measures. If you want to use calculated members, you can get away with changing the statement to:

     

    SCOPE
    (
        [Date].[Date].[Date].Members
    );

        {
            [Measures].Member1,
            [Measures].Member2
        } = <some assignment>;

    END SCOPE;

     

    but this may have side effects. For example, you can't apply formatting to this scope alone, because everything that intersects [Date].[Date].[Date].Members will be formatted that way as well.

One potential gotcha that bit me when implementing a calculated member as a regular measure is that the DSV defaults the named calculation to the Integer data type (because its expression is NULL), and the Integer type propagates to the regular measure. As a result, if your calculated expression is a ratio, the results may be rounded in some clients (e.g. Report Builder), although the actual query returns the correct results. To correct this, open the DSV source and change the data type of the named calculation to <xs:decimal> (you can't change it in design mode).
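
To put the technique together, here is a minimal sketch of the first advantage. It assumes a hypothetical Weighted Balance measure backed by a NULL named calculation in the DSV (cast to a decimal type, as discussed above) with its AggregateFunction property set to Sum; the Active Days and Avg Account Balance measure names are placeholders as well.

-- Sketch only: [Weighted Balance] is a hypothetical regular measure backed by a NULL
-- named calculation in the DSV with AggregateFunction = Sum; the other names are placeholders
SCOPE (Leaves([Account]));

    -- The expression is evaluated at the account (leaf) level only; the measure's Sum
    -- aggregation function then rolls the results up the Account dimension, which a
    -- calculated member created with CREATE MEMBER cannot do on its own.
    [Measures].[Weighted Balance] = [Measures].[Active Days] * [Measures].[Avg Account Balance];

END SCOPE;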