I’ve recently been working to optimize a long-running SSAS 2008 query that returns the 12-month history of a KPI. Since the KPI was displayed on a dashboard page alongside other KPIs and charts, it was important to optimize the query as much as possible. Rephrased for Adventure Works, the query went something like this:
WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)
MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])
SELECT [Measures].[CustomersWithIncrease] on 0,
ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1
from (select -{[Customer].[Customer].&[20075]} on 0 from [Adventure Works])
where {[Sales Territory].[Sales Territory Country].&[United States], [Sales Territory].[Sales Territory Country].&[United Kingdom]}
This query compares each customer’s sales with that customer’s sales for the previous month and returns, for each of the past 12 months, the number of customers whose sales increased. The query also uses a subselect to exclude a customer and a WHERE clause that restricts the results to the United States and the United Kingdom. The real-world query would get the members to filter in the WHERE clause from the UI.
The first optimization technique to point out is that instead of filtering the customer set using the Filter function, the query benefits from Mosha’s summator trick, which alone improved the query performance about ten times. However, the query was still taking a long time to execute (some 30 seconds with a set of 150,000 customers). Much to my surprise, though, flipping the WHERE and subselect clauses cut the query time in half:
WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)
MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])
SELECT [Measures].[CustomersWithIncrease] on 0,
ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1
from (select {[Sales Territory].[Sales Territory Country].&[United States],
[Sales Territory].[Sales Territory Country].&[United Kingdom]}
on 0 from [Adventure Works])
where -{[Customer].[Customer].&[20075]}
Here, the member that needs to be excluded was moved to the query WHERE clause, while the rest of the filter went to the subselect. While I don’t know what the server does exactly, my hypothesis is that the WHERE clause is more efficient in restricting the cube space before the rest of the query is executed. I wasn’t able to reproduce the performance gain with the Adventure Works cube, probably because it is too small or because the issue was design-specific. Larger cubes may benefit from the filter flip.
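For reference, the Filter-based formulation that the summator trick replaces might look something like the query below. This is only a sketch against Adventure Works, not the original production query: it filters the customer set on the same month-over-month comparison and counts what remains. The territory and customer filters are left out to keep the comparison focused on the calculated member.

```
-- Sketch: count customers by filtering the set, instead of summing 1/NULL flags.
WITH MEMBER [Measures].[CustomersWithIncrease] AS
  Count(
    Filter(
      [Customer].[Customer].[Customer],
      -- keep customers whose sales exceed their sales in the previous month
      [Measures].[Internet Sales Amount] >
        ([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember)
    )
  )
SELECT [Measures].[CustomersWithIncrease] ON 0,
  ParallelPeriod([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12])
    : [Date].[Calendar].[Month].&[2003]&[12] ON 1
FROM [Adventure Works]
```

One small difference to keep in mind: Count over an empty filtered set returns 0, whereas the summator returns NULL when no customer qualifies, so the two versions can differ in how empty months show up.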
IMPORTANT OOPS :(
As Mosha was quick to point out, in this particular case, where Sum is used, the attempt to exclude a given customer will not work with either approach (subselect or WHERE). Specifically, filtering out the customer is a no-op, since the [Customer].[Customer].[Customer] set inside Sum overwrites it. Consequently, the queries don’t exclude this customer from the computations at all and are therefore wrong. A great optimization without QA involvement, indeed :-). However, I still can’t explain why the subselect is slower if both options are no-ops. BTW, if the exclude filter were applied to another attribute (not Customer), then the subselect would produce the correct results, while the WHERE clause would still be a no-op.
In this particular case, the easiest way to exclude specific customers is to remove them from the set inside the SUM function, as the following query shows. This, of course, will impact the query performance. Based on my test, the member exclusion adds about 25% overhead to the query.
WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)
MEMBER CustomersWithIncrease as Sum({[Customer].[Customer].[Customer] - [Customer].[Customer].&[20075]} , [Measures].[increaser])
SELECT [Measures].[CustomersWithIncrease] on 0,
ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1
from (select {[Sales Territory].[Sales Territory Country].&[United States],
[Sales Territory].[Sales Territory Country].&[United Kingdom]}
on 0 from [Adventure Works])
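As for the earlier remark about excluding on an attribute other than Customer, a query of that shape might look like the sketch below. The choice of the Gender attribute and the member key &[F] are assumptions made purely for illustration; substitute whatever attribute the exclusion really applies to. Per the remark above, the subselect form is the one expected to return correct results here, while the same exclusion placed in the WHERE clause would be overwritten by the Sum.

```
-- Sketch only: the exclusion is on [Customer].[Gender], not [Customer].[Customer].
-- The attribute choice and the key &[F] are illustrative assumptions.
WITH MEMBER [Measures].[increaser] AS
  Iif([Measures].[Internet Sales Amount] >
      ([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember), 1, NULL)
MEMBER [Measures].[CustomersWithIncrease] AS
  Sum([Customer].[Customer].[Customer], [Measures].[increaser])
SELECT [Measures].[CustomersWithIncrease] ON 0,
  ParallelPeriod([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12])
    : [Date].[Calendar].[Month].&[2003]&[12] ON 1
-- exclusion expressed in the subselect rather than in WHERE
FROM (SELECT -{[Customer].[Gender].&[F]} ON 0 FROM [Adventure Works])
```

It is worth checking the result against a version that removes the unwanted customers inside the Sum set, since that form does not depend on how the server applies the subselect.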