Implementing Cube Settings

I had to tackle an interesting requirement the other day. A cube had a number of KPIs defined but the threshold values that were used to define the “good”, “bad”, and “so-so” status ranges had to be defined as configurable values at any level of the organization dimension and KPIs. I settled on implementing a measure group that intersects the organization dimension and a new KPI dimension that lists the KPI names.

011409_0257_Implementin1

For example, as the screenshot shows, each KPI has two threshold values (Threshold Value1 and Threshold Value2) which can be defined at any level of the organization hierarchy. Note that the parent values read the setting values directly from the fact table instead of being derived by rolling up from the children.

The first implementation approach that comes to mind is to use the None AggregateFunction function for the measure group so you read the measure values associated with the member directly from the fact table. Some experimenting revealed the following “noble” truths about the None AggregateFunction:

    1. BOL is wrong. I refer to the following statement:
      (None AggregateFunction) No aggregation is performed, and all values for leaf and nonleaf members in a dimension are supplied directly from the fact table for the measure group that contains the measure. If no value can be read from the fact table for a member, the value for that member is set to null.

Specifically, When None is used, the server loads the fact data into the measure group leaves (just like with other aggregate function) but it doesn’t roll up the values to the dimension leaves. See this Mosha’s post to understand the difference between the dimension and measure group leaves.

  1. The way it stands, the None function brings a questionable value. A narrow case exists where you may want to perform assignments or calculations in the measure group leaves as the above post demonstrates. Why you would want to do this instead performing them at the fact table level (or as named calculations) for obvious performance advantages is beyond me.
  2. There is no way to bring the measure leaf values to the dimension leaves. Shoot me a note if you manage to come with some clever hack to do so.

Note: It would have been great if the None function behaves as described in the documentation. Even better, it could be nice if the server automatically brings up the fact value when the cube slice results in one fact record, i.e. the user has drilled down to measure group grain. This will make it possible to implement text-based measures, such to capture comments, or other text-based attributes, that don’t justify moving them to a dimension. If you agree, vote for this enhancement on connect.microsoft.com.

So, how do we get around this predicament? One option is define a custom ~ operator for the Organization parent-child dimension so the server doesn’t roll up the measure. However, the custom operator will be applied to all measures sliced by the Organization dimension which more than likely will be an issue. Of course, a trivial workaround is to duplicate the Organization dimension but this leads to other disadvantages. IMO, a better solution might be to use the DataMember property and a scope assignment:

SCOPE (

MeasureGroupMeasures(“KPI Configuration”),

[Organization].[Organization Hierarchy].Members,

[KPI].[KPI].[KPI]

);

this = ([Organization].[Organization Hierarchy].CurrentMember.DataMember);

END SCOPE;

I scope at the KPI Configuration measure group where it intersects with the members of the Organization and KPI dimensions. The AggregationFunction property of the measures in the KPI Configuration measure group is now set to SUM. The tuple on the right side of the this assignment returns the measure value associated directly with the current data member. The new result is that this assignment overwrites the SUM function with the measure leaf value.

Reporting Services 2008 Dashboard Goes Live

Thanks to the Robert Bruckner’s blog, I’ve learned that Microsoft has published the Reporting Services 2008 Information Aggregator. It is a cool dashboard page that aggregates a variety of resources related to Reporting Services, such as training content, white papers, blogs and more. I’m proud to be featured as an MVP J The page also includes to my most recent books and training videos I’ve authored.

Kudos go to the Reporting Services user education team who worked hard to implement this page. Check out the SSRS aggregator and add it to your favorites.

SQL Server 2005 Service Pack 3 is Out

Yesterday, Microsoft released SQL Server 2005 Service Pack 3. Reporting Services Report Builder added support with Teradata. Note that some manual configuration is required to turn things ‘on’ as well as you need to install the .Net provider from Teradata.

Dimensionalizing Pivoted Measures

Issue

You have a table with pivoted measures, such as:

121608_0043_Dimensional1

Use the following statement to create and populate the table:

CREATE
TABLE pvt
(VendorID int, DateID int, MaturityAmount1 int, MaturityAmount2 int,

MaturityAmount3 int, RepricingAmount1 int, RepricingAmount2 int, RepricingAmount3 int);

GO

INSERT
INTO pvt VALUES (1,1,4,3,5,4,4, 100);

INSERT
INTO pvt VALUES (2,1,4,1,5,5,5, 100);

INSERT
INTO pvt VALUES (1,2,4,3,5,4,4, 200);

INSERT
INTO pvt VALUES (2,2,4,1,5,5,5, 200);

INSERT
INTO pvt VALUES (3,1,4,3,5,4,4, 300);

INSERT
INTO pvt VALUES (3,2,4,3,5,4,4, 400);

INSERT
INTO pvt VALUES (3,3,4,3,5,4,4, 100);

INSERT
INTO pvt VALUES (4,1,4,2,5,5,4, 100);

INSERT
INTO pvt VALUES (5,1,5,1,5,5,5, 200);

GO

Let’s say you need a way to unpivot the results in the following format which is more suitable for OLAP:

121608_0043_Dimensional2

In this way, you eliminate the repeating measure columns with a single column. Here, the BucketID column is introduced so you could join the results to a dimension to let the user cross-tab the results on columns.

Solution

Usually, ETL processes would take care of transforming data in format suitable for analysis especially with large data volumes. However, with smaller fact tables and an OLAP layer directly on top of the operational database, such as having SQL views directly on top of the source database, you may need an ad-hoc approach to unpivot the results. Assuming you are fortunate to have SQL Server 2005 or above, you can use the SQL Server UNPIVOT function to unpivot on multiple columns. The following query does the trick:

SELECT VendorID, DateID,
CAST(REPLACE(Maturity,
‘MaturityAmount’,
)
as
int)
as BucketID,

MaturityAmount, RepricingAmount

FROM


(SELECT VendorID, DateID, MaturityAmount1, MaturityAmount2, MaturityAmount3,

RepricingAmount1, RepricingAmount2, RepricingAmount3 FROM pvt) p

UNPIVOT


(MaturityAmount FOR Maturity IN
(MaturityAmount1, MaturityAmount2, MaturityAmount3))
AS UnPivotedMaturity

UNPIVOT


(RepricingAmount FOR Repricing IN
(RepricingAmount1, RepricingAmount2, RepricingAmount3))
AS UnPivotedRepricing

WHERE
REPLACE(Maturity,
‘Maturity’,
)
=
REPLACE(Repricing,
‘Repricing’,
)

Notice that a WHERE clause is needed to join the two unpivoted sets so you don’t end up with a Cartesian set. The net result is that the number of rows in the unpivoted dataset is:

Number of Rows in Pivoted (original) Table * Number of Repeated Columns

So, in our case, the dataset will have 9 * 3 = 27 rows.

Troubleshooting the SharePoint Add-in for Reporting Services

A common issue when setting up Reporting Services integration with SharePoint is that the SharePoint add-in is not installed properly although the setup program doesn’t indicate any issues. More than likely, the culprit is that the person who installed the add-in is not SharePoint Web farm administrator and Site Collection administrator.

As a first stop for troubleshooting SharePoint integration issues, you should examine the add-in log after the add-in setup completes to verify if it was installed successfully. You’ll find the log file (RS_SP_<N>.log) in your temp folder, such as \Documents and Settings\<login>\Local Settings\Temp. Specifically, check that all activation tasks have completed successfully and that there are no errors. If all is well, the log files should look as follows:

User: <userid>

Installing Report Server feature.

Calling copyappbincontents command.

copyappbincontents command completed successfully.

Adding Report Server feature to farm.

Installed Report Server feature.

Activating Report Server feature to root level site collections.

Activating feature in web app ‘SharePoint – 80’

Activating feature to root site collection: http://<server>

Activated feature for root site collection: http://<server>

Activating Report Server feature in all admin site collections.

Activating feature in web app ”

Activating feature to root site collection: http://<server>:35000

Feature already activated in site collection http://<server>:35000

Cumulative Update 2 for SQL Server 2008

Microsoft has released the Cumulative Update 2 build (build number 1779) for SQL Server 2008. The update is not publicly available as of now. It has to be released individually based on whether you are running into a bug fixed in the build and it is password-protected. Among other things, it fixes the freeze issue associated with opening SSAS projects in BIDS.

Referencing Parameters in SSRS Calculated Members

Issue: You author an SSRS report that connects to a cube. You want to create a calculated member that uses a report parameter. For example, the Selector calculated member in the following query returns 1 if a given employee has exceeded 1 million in sales for the dates specified by the user in the report parameter.

WITH MEMBER [Measures].[Selector] AS SUM (

StrToSet(@DateCalendar), iif([Measures].[Reseller Sales Amount]>1000000, 1, NULL)

)

select non empty {[Measures].[Reseller Sales Amount],[Measures].[Reseller Order Quantity], selector} on 0,

non empty (Filter([Employee].[Employee].[Employee].Members, [Measures].[Selector]>0) * [Date].[Calendar].[Calendar Year]) on 1

from (select StrToSet(@DateCalendar) on 0 from [Adventure Works])

 

However, when you run the query you get the following error if you target Analysis Services 2005:

The DateCalendar parameter cannot be resolved because it was referenced in an inner subexpression.

Obviously, the built-in SSAS data provider has an issue parsing parameters in a calculated member. This issue is apparently fixed in SQL Server 2008 where the above query executes just fine.

Workaround: As a workaround with SQL Server 2005, try defining a set that references the parameter and use the set in the calculated members, as follows:

WITH SET [SchoolYears] AS STRTOSET(@DateCalendar)

MEMBER [Measures].[Selector] AS SUM (

[SchoolYears], iif([Measures].[Reseller Sales Amount]>1000000, 1, NULL)

)

select non empty {[Measures].[Reseller Sales Amount],[Measures].[Reseller Order Quantity], selector} on 0,

non empty (Filter([Employee].[Employee].[Employee].Members, [Measures].[Selector]>0) * [Date].[Calendar].[Calendar Year]) on 1

from (select StrToSet(@DateCalendar) on 0 from [Adventure Works])

Dundas Chart Controls for .NET Available

On Friday, Microsoft released chart controls for Microsoft .NET Framework 3.5. These are re-tooled Dundas .NET chart controls which are now baked in the .NET framework. This is important because you can add the same stunning Dundas chart features to your Windows Forms and ASP.NET applications without external dependencies or licensing fees. It also minimizes your learning curve, as the Dundas controls share the same object model across technologies, e.g. Windows Forms to ASP.NET to Reporting Services.

Currently, only the Dundas chart control has made the cut to .NET. There is a sticky post at the top of the forums that has all of the links. Keep an eye on the PDC announcements today for more BI goodies.

SSAS WHERE vs. Subselect Puzzle

I’ve been involved recently in a quest to optimize an SSAS 2008 long-running query that would return the 12-month history of a KPI. Since the KPI was displayed on a dashboard page alongside other KPIs and charts, it was important to optimize the query as much possible. Rephrased to Adventure Works, the query went something like this:

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select -{[Customer].[Customer].&[20075]} on 0 from [Adventure Works])

where {[Sales Territory].[Sales Territory Country].&[United States], [Sales Territory].[Sales Territory Country].&[United Kingdom]}

This query compares the sales for each customer with the customer’s sales for the previous month and returns the number of customers who have increased sales for the past 12 months. Also, the query uses a subselect to exclude a customer and a WHERE clause that restricts the results to USA and United Kingdom only. The real-world query would get the members to filter in the WHERE clause from the UI.

The first optimization technique to point out is that instead of filtering the customer set using the Filter function, the query benefits from Mosha’s summator trick, which alone improved the query performance about ten times. However, the query was still taking long to execute (some 30 seconds with set of 150,000 customers). Much to my surprise though, flipping the WHERE and the subselect clauses cut the query time in half.

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])

where -{[Customer].[Customer].&[20075]}

Here, the member that needs to be excluded was moved to the query WHERE clause, while the rest of the filter went to the subselect. While I don’t know what the server does exactly, my hypothesis is that the WHERE clause is more efficient in restricting the cube space before the rest of the query is executed. I wasn’t able to reproduce the performance gain with the Adventure Works cube probably because is too small or the issue was design specific. Larger cubes may benefit from the filter flip.

IMPORTANT  OOPS [:(]

As Mosha was quick to point out, in particular case where SUM is used, the attempt to exclude a given customer in both cases (subselect and WHERE) will not work. Specifically, filtering out the customer is a no-op, since the Customer.Customer.Customer inside Sum overwrites it. Consequently, the queries don’t exclude this customer from the computations at all and therefore are wrong. A great optimization without QA involvement, indeed :-). However, I still can’t explain why subselect is slower if both options are no-ops. BTW, if the exclude filter was done on another attribute (not Customer), then the subselect would produce the correct results, while the WHERE clause will still be a no-op.

In this particular case, the easiest way to exclude specific customers is to remove them from the set inside the SUM function, as the following query shows. This, of course, will impact the query performance. Based on my test, the member exclusion adds about 25% overhead to the query.

 

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum({[Customer].[Customer].[Customer] – [Customer].[Customer].&[20075]} , [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])

Report Builder 2.0 RTM’ed

Today, Microsoft released Report Builder 2.0 (build 10.00.1600.60) , as Robert Bruckner announced in his blog. This is a major milestone in the Reporting Services roadmap. In my opinion, Report Builder 2.0 is one of the most important and exciting features of Reporting Services 2008 as it bridges the gap between standard and end-user (ad hoc) reporting. You should definitely consider Report Builder and Analysis Services for your ad-hoc reporting projects.