Posts

SSAS WHERE vs. Subselect Puzzle

I’ve been involved recently in a quest to optimize an SSAS 2008 long-running query that would return the 12-month history of a KPI. Since the KPI was displayed on a dashboard page alongside other KPIs and charts, it was important to optimize the query as much possible. Rephrased to Adventure Works, the query went something like this:

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select -{[Customer].[Customer].&[20075]} on 0 from [Adventure Works])

where {[Sales Territory].[Sales Territory Country].&[United States], [Sales Territory].[Sales Territory Country].&[United Kingdom]}

This query compares the sales for each customer with the customer’s sales for the previous month and returns the number of customers who have increased sales for the past 12 months. Also, the query uses a subselect to exclude a customer and a WHERE clause that restricts the results to USA and United Kingdom only. The real-world query would get the members to filter in the WHERE clause from the UI.

The first optimization technique to point out is that instead of filtering the customer set using the Filter function, the query benefits from Mosha’s summator trick, which alone improved the query performance about ten times. However, the query was still taking long to execute (some 30 seconds with set of 150,000 customers). Much to my surprise though, flipping the WHERE and the subselect clauses cut the query time in half.

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])

where -{[Customer].[Customer].&[20075]}

Here, the member that needs to be excluded was moved to the query WHERE clause, while the rest of the filter went to the subselect. While I don’t know what the server does exactly, my hypothesis is that the WHERE clause is more efficient in restricting the cube space before the rest of the query is executed. I wasn’t able to reproduce the performance gain with the Adventure Works cube probably because is too small or the issue was design specific. Larger cubes may benefit from the filter flip.

IMPORTANT  OOPS [:(]

As Mosha was quick to point out, in particular case where SUM is used, the attempt to exclude a given customer in both cases (subselect and WHERE) will not work. Specifically, filtering out the customer is a no-op, since the Customer.Customer.Customer inside Sum overwrites it. Consequently, the queries don’t exclude this customer from the computations at all and therefore are wrong. A great optimization without QA involvement, indeed :-). However, I still can’t explain why subselect is slower if both options are no-ops. BTW, if the exclude filter was done on another attribute (not Customer), then the subselect would produce the correct results, while the WHERE clause will still be a no-op.

In this particular case, the easiest way to exclude specific customers is to remove them from the set inside the SUM function, as the following query shows. This, of course, will impact the query performance. Based on my test, the member exclusion adds about 25% overhead to the query.

 

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum({[Customer].[Customer].[Customer] – [Customer].[Customer].&[20075]} , [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])

Memory-based Dimensional Model (MDM) on the Horizon

As the news arrived today (see Mosha Pasumansky blog, Chris Webb blog, and Marco Russo blog) , the BI Conference revealed what has been brewing for a while in the Microsoft BI nest. Project Gemini, which one may call Memory-based Dimensional Model (MDM) to draw a comparison with the Unified Dimension Model (UDM), will let business users create ad-hoc cubes that are stored in memory. This will let Microsoft compete with other BI players, such as the Cognos TM1 offering which Cognos acquired from Applix. Accidentally, I dare to predict that MDM will be very similar to TM 1 (watch a demo here) but of course better J.

While I need to get my hands on this Gemini thing to say something worthwhile, I really hope that existing UDM cubes could benefit from it as well, especially in terms of performance. Today, folks are pushing SSAS to its limits. A dashboard page, for example, may need to execute many queries and crunch huge volumes of data to present trend graphs within seconds. It will be cool if Gemini lets you cache subcubes in memory to speed up these scenarios. For example, if you need to implement a bunch of customer-related KPIs, it will be nice if you could tell Gemini to cache in memory or materialize to disk the pre-aggregated at the customer level data and which dimensions can be used to slice these KPIs.

What about giving the business users the option to create ad-hoc cubes? I have to admit I have mixed fillings about this. The term “OLAP” alone is known to cause severe brain crunch to many users. I am a bit skeptical that “off you go, start building your own cubes in Excel” philosophy will really fly. If you package a cool wizard that hides some of the dimensional model complexity, how would verify that the results are indeed correct so you don’t end up with as many versions of the truth as the number of users? How would teach end users MDX to create their own calculations? Not sure how much of your time Gemini will really save if this is its major selling point. But again I may change my point of view as details unfold as life often proves me wrong. Meanwhile, long live MDM and Kilimanjaro, which is the code name for SQL Server.NEXT, scheduled for H1 2010!

Sorting Dates in a PivotTable Report

Issue: You use Excel as an OLAP browser connected to an SSAS cube. You’ve defined the name of the Month attribute in the Date hierarchy in the format MMM-YYYY. While months are sorted in an ascending order in Excel, descending sorting doesn’t work; or rather Excel sorts them as strings. That’s because Excel interprets these captions as strings instead of dates even if you set the Value property of the Month attribute to a field of DateTime type.

Workaround: You can use the Excel custom sorting feature as a workaround while waiting for Excel to become a better OLAP citizen:

  1. Connect to the cube and select the Month attribute to get all months displayed in Excel. Select all month cells.
  2. Click the Excel Office Ribbon button.
  3. In the Popular tab, click the Edit Custom Lists button.

091108_0106_SortingDate1

 

  1. In the Custom Lists dialog box that follows, click Import button to create a new custom list for the dates in the format MMM-YYYY. Click OK 

091108_0106_SortingDate2

As you can see, there is a custom sorting list for months in the format MMM, so descending sort will work if the month captions are in MMM format. However, the end user won’t be able to tell the months apart if the user requests only the Month attribute on the report. So, you kind of have to pick up your own poison.

Personalized Cubes

Scenario: You need to personalize the captions of calculated members or/and KPIs of an SSAS cube. For example, a solution vendor may need to deploy a cube to multiple clients. Instead of maintaining separate cube definitions, the vendor may prefer to change the captions of certain calculated members and KPIs per client. There wasn’t a good story in SSAS 2005 about such requirements. In SSAS 2008, there are at least two techniques to tackle cube personalization.

First, SSAS 2008 introduces a new Caption property to calculated members and KPIs, which Chris Webb has already written about. Vidas Matelis has provided great examples of using the Caption property. Thanks to this enhancement, you can keep the names of calculated members and KPIs the same but change only their captions. This is the good news.

What if you need to make definition changes programmatically, such as to check the type of the client and change captions accordingly? Well, unfortunately the Caption property is not available in the AMO library as it has not been added to the KPI class. What’s worse is that the CREATE KPI statement doesn’t actually add the KPI definition to the KPIs Collection so you cannot programmatically change the caption by enumerate the AMO Kpis collection and update the KPI definition. To make the story short, the only way I’ve found to update the KPI caption is to update the cube MDX script (Cube.MDXScripts AMO collection). This forces you to resort to search and replace text inside the cube script. Sigh… The following code shows how to use AMO to connect to the cube and change the cube MDX script.

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using Microsoft.AnalysisServices;

namespace AMO

{


class
Program

{


static
void Main(string[] args)

{


Server server = new
Server();

server.Connect(“<server>”);


Database database = server.Databases[“<database name”];


Cube cube = database.Cubes[“<cube name>”];

        // assume KPIs are defined in the cube default script


MdxScript mdxScript = cube.MdxScripts[0];


Command command = mdxScript.Commands[0];


// search and replace captions in the command.Text

        // command.Text = …

cube.Update(UpdateOptions.ExpandFull);

}

}

}

The second personalization option is the new Personalization Extension mechanism in the SSAS 2008, which I personally haven’t tried out. Microsoft has provided a sample here. A custom personalization extension is a .NET code that is deployed to the server. It lets you create user-specific session-level calculated members and KPIs without having to change the cube definition. The extension changes are transient and disappear when the user’s session ends.

Nasty SSAS 2008 Bug

There is a nasty known bug with SSAS 2008, where if a query that requests two measures with Sum and LastNonEmpty aggregate functions side by side and the query uses NON EMPTY (which all OLAP clients use by default), the server forces the measure with the SUM function to display empty.

Steps to repro:

Use the Analysis Services Tutorial cube (Lesson 10) which is included in the SSAS samples. Don’t use the Adventure Works cube because it has measure expressions for certain measures, so changing a measure aggregation function to LastNonEmpty and deploying the cube results gives the error “Errors in the OLAP storage engine: The metadata for the statically linked measure group, with the name of ‘Internet Sales’, cannot be verified against the source object.” That’s a different issue that has been around since SQL Server 2005 but apparently didn’t get fixed.

1.    Open the Analysis Services Tutorial project from Lesson 10

2.    Open the Analysis Services Tutorial.cube in the Cube Designer.

3.    In the Cube Structure tab, expand the Internet Sales measure group and select the Internet Sales-Order Quantity measure.

4.    In the Properties pane, change the AggregateFunction to LastNonEmpty.

5.    Deploy the cube.

6.    Browse the cube with the cube browser or Excel. Create a report that requests Internet Sales Amount and Internet Sales-Order Quantity measures sliced by the Subcategory attribute of the Product dimension.

Notice that the query forces Internet Sales Amount to empty. While waiting for an official hotfix from Microsoft, you can use one of the following workarounds meanwhile:

1. Don’t use the NON EMPTY clause in the query

or

2. Change the following property in the \Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini file.

<DisableCalcExpressNonEmpty>1</DisableCalcExpressNonEmpty>

Not sure what DisableCalcExpressNonEmpty does as it is not documented but appears to work.

UPDATE 8/18/2008

DisableCalcExpressNonEmpty is a new property in AS 2008. Severe performance degradation is likely to result when set to 1, but it should be comparable to AS 2005 performance if not a little better. A hotfix better come out quickly [:(]

Static Credentials

A new week comes with new issues… I ran into an interesting issue today with Windows integrated security and SharePoint. I was troubleshooting an issue on a behalf of a Windows user that connects to an SSAS server. To repro the issue, I used the SharePoint “Sign in as a different user” feature. To facilitate testing, I decided to save the password in the standard Windows authentication dialog that follows (“Remember my password” checkbox).

After this “convenient” setup, to my surprise all calls to that server went the credentials of that user, including connections to the cube from SQL Server Management Studio and Excel! For example, when I connected to the SSAS database with SSMS and attempted to manage the server, I was greeted with the following message although I have admin rights to the server:

The connected user is not an Analysis Services server administrator. Only an administrator can make changes to server properties. (Microsoft.AnalysisServices.ManagementDialogs)

The SQL Server Profiler revealed that indeed the server impersonates any call under that user. Not sure what exactly happens when you save the password but be careful of this issue with “static” credentials. To correct the issue, use the Sign in as a different user feature again but don’t check the Remember my password option.

Yet Another Post About Analysis Services HTTP Connectivity

With risk to iterate the obvious, here is a lesson learned from the trenches:

Scenario: You want to let external users browse an Analysis Services cube over the Internet.

Setup: You have followed the steps in the Configuring HTTP Access to SQL Server 2005 Analysis Services on Microsoft Windows Server 2003 article by Edward Melomed to set up Analysis Services for HTTP connectivity.

Issue: Windows integrated security doesn’t work.

After a long battle where countless options have been tried, I have to admit a defeat. In our case, we’ve set up identical Windows local accounts on the SSAS server. This worked fine when connecting Report Builder 2.0 to the cube from our home machines. However, it appeared that the firewalls that our clients use make a minced meat of Windows integrated security. Strangely, Excel connects without a problem to the cube using Windows integrated security. However, Report Builder 2.0 chokes big time. I guess this has to do with differences between OLE DB (which Excel uses) and ADOMD.NET (used by Report Builder 2.0).

So, we had to give up on Windows integrated security over Internet. Instead, we went for Basic security with SSL, as follows:

  1. Configure the msmdpump virtual folder for Basic security (this should be the only available security option).
  2. Install the SSL certificate on the IIS server.
  3. In Report Builder 2.0, set up a new data source that uses the Microsoft SQL Server Analysis Services. Click the Edit button on the Data Source Properties dialog box.

080208_0230_YetAnotherP1

  1. In the Connection Properties dialog, enter the URL to the msmdpump.dll, such as https://adventure-works/olap/msmdpump.dll.
  2. Enter the user credentials. Click the Save My Password to avoid being asked to retype the password.
  3. Here is an important step if you want to populate the Connect to a Database drop-down list. Click the Advanced button and locate the Integrated Security setting. Select and clear the SSPI default setting. Click OK.
  4. Back to the Connection Properties dialog, expand the Connect to a Database drop-down list. You should see the list of the SSAS databases the user has rights to access. Select a database.
  5. Click the Test Connection dropdown. The connection should succeed.

The data source is set up now. You can proceed with setting up a dataset with the MDX Query Designer.

Microsoft SQL Server 2008 Feature Pack RC0, June 2008

Today, Microsoft released an updated feature pack for SQL Server 2008 RC0. Among other things, it includes:

  1. A web downloadable installer for Report Builder 2.0 which is not included anymore with the SQL Server 2008 setup
  2. An updated version of the RS add-in for SharePoint
  3. Analysis Services 10.0 OLE DB Provider

Intelligencia Query

Chris Webb, a fellow MVP and MDX guru, who’s blog is a must-read for all SSAS junkies, was kind enough to let me take a look at the Intelligencia Query product he’s been working on for a while with Andrew Wiles. Chris announced the product back in April which is now selling as a commercial offering.

Now, lots of things have been said about the current state of the SSRS-SSAS integration, which is to say the least wanting. In a previous blog, I wrote “In my opinion, the biggest challenge the Microsoft BI initiative faces today is the inadequate support for Analysis Services.” Alas, we won’t witness an improved SSRS-SSAS integration in SQL Server 2008 as Microsoft decided to focus on other areas and priorities.

Rest assured though that Microsoft understands the importance of this integration scenario. I personally have voiced my concerns on a few occasions and have put this on the top of my wish list which I shared with the Reporting Services team. There are good things happening already which make me believe that SQL Server.NEXT (post-SQL Server 2008) will materialize this wish. What’s really needed is Excel-like support of SSAS in Reporting Services, plus calculated members which are already supported.

Then, the question is what to do meanwhile if you target Analysis Services (and you should). One approach is to take the “I’ll stick with Microsoft” approach. This may require you find workarounds for the static schema and other limitations, such as using the Microsoft OLE DB Provider for Analysis Services instead of the built-in SSAS provider. This is not as bad as it seems as little out-of-box thinking usually gets the job done. By the way, I wrote some 50 pages in my next book to cover integrating SSRS with SSAS in as much detail as possible.

The second approach is to look at third-party offerings, such as Radius Producer and Intelligenca Query. What I liked about Intelligenca Query is that it doesn’t require throwing the baby with the water. You can still keep the Microsoft Report Designer but use Intelligenca Query to replace the built-in SSAS provider. This works because just like the built-in provider, Intelligenca Query is implemented as a data extension. I tested Intelligenca Query with SQL Server 2008 and it worked without any issues with the BIDS Report Designer and Report Builder 2.0 (aka Report Designer Preview).

The good thing about Intelligenca Query is that it lets you plug in an arbitrary MDX query. Behind the scenes, it pulls out a little trick where it rewrites the report dataset by transposing columns to rows. Consequently, you have to use the matrix region to rotate the rows back to columns to recreate the original query results. Now, my ideal provider would support a dual mode where I can tell it not to rewrite anything but just to give me the results. Of course, if another dimension member is added, it won’t show up on the report columns (assuming a table with fixed columns) but I can see this being useful for notice users that create ad hoc reports or when the columns are fairly static.

051508_0237_Intelligenc1

Another thing I liked, of course, about Intelligenca Query is that it eliminates the nagging issue of server aggregates which are the aggregates for the dimension All members. With the built-in provider, you have to explicitly request them by using the Aggregate function but they won’t be retuned if you hand-code your query. Intelligenca Query simply brings them as additional rows. What if you don’t want them, such as when you need a group footer? No problem, just exclude them from the query and group on the rest of the rows.

Another cool thing is working with parent-child hierarchies. The MS built-in provider takes this strange design pattern where it dumps all members in a single column, thereby making it virtually impossible to create subtotals per level. With Intelligence Query this issue disappears:

051508_0237_Intelligenc2

I also liked the graphical query designer which is pretty much in par with the Microsoft graphical MDX Query Designer. On the downside, some outstanding work is required to handle parameters and synchronize the text query with the graphical designer.

If the SSAS built-in provider is driving you nuts, I encourage you to take a look and evaluate Intelligenca Query.

SSAS HTTP Connectivity and SharePoint

Scenario

You won’t enable SSAS HTTP connectivity via a SharePoint site.

Issue

You followed the instructions in the Configuring HTTP Access to SQL Server 2005 Analysis Services on Microsoft Windows Server 2003 whitepaper but you get ” Failed to start monitoring directory changes” error when you try to access the SSAS server by its pump URL, such as http://www.adventure-works/olap/msmdpump.dll.

Solution

Ignore the step where you had to create an olap pool and assign the OLAP vroot to it. Instead, assign the vroot to the same IIS application pool as the one the SharePoint site belongs to, such as SharePoint -80. In addition, make sure you enable the web extension to the pump dll, as explained in the above article.

If you have created non-domain Windows accounts on the server for testing, make sure that the server and local Windows accounts (the one the user will use to log in to the user machine) are synchronized, that is, have the same name and password. For example, if I create a non-domain local account Bob on the server where SSAS is installed, Bob must log in to his machine as Bob and both accounts (user and server) must have the same password.