Posts

Excel PivotTable Tabular Reports

Here is a great tip thanks to Greg Galloway who pointed me to the Steve Novoselac’s excellent blog. By default, Excel 2007 PivotTable stacks (hierachizes) the attributes on the report even if they are from the same dimension. For example, let’s say the user wants to slice by the product hierarchy but want also some additional product attributes, such as the product color, size, etc. By default, Excel will produce the following layout:

033109_2111_ExcelPivotT1

Needless to say, this is hardly what the users want to see as they would prefer the product attributes to be displayed in a tabular format. Luckily, this is easy albeit not very intuitive.

  1. Before you drop the first field on PivotTable, click the Design main menu on the ribbon (the one next to Options menu).

033109_2111_ExcelPivotT2

  1. Expand the Report Layout drop-down menu and select Show in Tabular Form. Again, make sure that that PivotTable is empty before you do this.
  2. If you want to remove subtotals, expand the Subtotals drop-down menu and select Do Not Show Subtotals.
  3. Finally, to remove the +/- indicators, right-click on PviotTable and select PivotTable Options. On the Display tab, uncheck the Show Expand/Collapse Buttons.

Then, you can proceed to laying out the report as usual. Now you have a cool-looking tabular report.

033109_2111_ExcelPivotT3

Measures on Rows

Issue: You want an MDX query that returns measures on rows and calculated members on columns. You get The Measures hierarchy already appears in the Axis0 axis error. That’s because calculated members added to measures are treated as measures and you cannot have the same dimension (Measures in this case) on different axes.

Solution: Let’s point out that the above-mentioned error doesn’t occur if you request dimensions on columns. For example, the following is a perfectly legit query:

select {[Date].[Calendar].[Calendar Year].&[2002],

[Date].[Calendar].[Calendar Year].&[2003]} on 0,

{[Measures].[Internet Sales Amount],

[Measures].[Reseller Sales Amount],

[Measures].[Sales Amount]

} on 1

from [Adventure Works]

031409_0126_MeasuresonR1

In addition, you may be able to use the capabilities of the tool to rotate dimensions from columns to rows, such as using a crosstab report layout in Reporting Services. Interestingly, PivotTable in Excel 2007 lets you add measures on rows although behind the scenes Excel requests measures on columns but does an internal rotation to move the measure columns on rows.

However, sometimes you may need asymmetric columns that filter the results. This may force you to use calculated members on columns and measures on rows. Since you cannot put the calculated members on Measures, consider adding them to a dimension. The following query adds Column1 and Column2 calculated members to the Product.Category attribute. Notice that the first column filters the bikes sales for USA only.

with member [Product].[Category].Column1 as ([Product].[Category].[Bikes], [Date].[Calendar].[Calendar Year].&[2002], [Geography].[Country].&[United States])

member [Product].[Category].Column2 as ([Product].[Category].[Bikes], [Date].[Calendar].[Calendar Year].&[2003])

select {[Product].[Category].Column1, [Product].[Category].Column2 } on 0,

{[Measures].[Internet Sales Amount],

[Measures].[Reseller Sales Amount],

[Measures].[Sales Amount]

} on 1

from [Adventure Works]

Of course, you can use any tuple or aggregation function for the member definition. BTW, you can cheat SSRS to bypass the rule that insists that measures must be on columns and the entire validation and preparation goodness by manually entering the query in the report definition and defining the dataset fields and mappings.

031409_0126_MeasuresonR2

Invisible Cubes

I had a head-scratcher the other day. All of a sudden, SQL Server Management Studio stopped showing the cubes in the dropdown of the MDX Query tab although MDX queries and reports would execute just fine. The Reporting Services MDX Query Designer would complain with “No cubes found” error when I attempted to launch it from the dataset properties. This strange behavior coincided with installing SQL Server 2008 SP1 CTP, so I was quick to find a culprit. My suspicion was that the SSAS Windows authentication was failing was some reason although the SQL Server Profiler would show that the right Windows identity was connecting to the database.

To troubleshoot this further, I issued the following schema rowset query when connected to the SSAS database in SSMS:

select * from $system.mdschema_cubes

Interestingly, the query would return a list of the database dimensions (SSAS treats dimensions as cubes and prefix the dimension name with $) but it wouldn’t return the cube name itself. Then, I had a Eureka moment. I went to the Cube Structure tab in the BIDS Cube Designer, and I realized that that somehow I managed to set the cube Visible property to False. Not sure why you would ever want to make the cube invisible and why this property is there to start with but there it was. Flipping it back to True and redeploying the cube fixed the mysterious No Cubes Found error.

Hacking MDX Query Designer Parameters

[View:/CS/cfs-file.ashx/__key/CommunityServer.Blogs.Components.WeblogFiles/blog/simpleparameters.zip]Continuing our intrepid journey in the land of SSRS-SSAS integration, after we’ve figured out how to retrieve the cube metadata, we are ready to tackle another unchartered territory: query parameters.

Issue: You need to parameterize the MDX query by a measure or a simple parameter. Take for example the following report (attached).

022609_0224_HackingMDXQ1

This report shows the Internet Sales Amount measure broken down by the Employees parent-child hierarchy. What’s more interesting is the Show Descendants parameter. It lets you filter the report for a single employee (Show Descendants = False) or the descendants of the selected employee including the selected employee (Show Descendants = True). So, what’s the issue? You can whip out such a report in no time if you source data from a relational database. However, the MDX Query Designer insists that the parameter is resolved to a single dimension member or a set of dimension members. Apparently, simple parameters or parameterizing by measures were considered uncommon scenarios. So, we need a sleek hack.

Solution: Luckily, there is a way to gain more control over the MDX query parameters but the design mode (the one that lets you drag and drop stuff) won’t cut it. So, it may make sense to author the report query in design mode as much as you can before you jump into the cold waters of MDX query mode because there is no going back to drag and drop, sorry.

  1. To start with, you need to define a query parameter for the ShowDescendants parameter. Switch to MDX mode and click the Query Parameters toolbar button (enabled only in MDX mode).
  2. Add a new ShowDescendants “hanging” parameter that doesn’t reference any dimension. Make sure that the query parameter name matches the report parameter name. Otherwise, the MDX Query Designer will treat the parameter as a new parameter and it won’t associate it to the existing ShowDescendants report parameter. Note that the parameter name is case-sensitive.

022609_0224_HackingMDXQ2

  1. Leave the Dimension and Hierarchy columns empty. Luckily, they are not required in MDX mode. Either type in a measure name, such as [Measures].[Sales Summary], or enter DEFAULT in the Default column. In the latter case, the default cube measure will be used when you test the query in the MDX Query Designer. See this blog by Chris Webb for ideas about how to speed up the default measure if needed. Click OK.
  2. Back to the query, change it as follows:

SELECT NON EMPTY { [Measures].[Reseller Tax Amount] } ON COLUMNS, NON EMPTY

IIF(@ShowDescendants=False, StrToMember(@EmployeeEmployees), DESCENDANTS(StrToMember(@EmployeeEmployees)))

DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME, PARENT_UNIQUE_NAME, LEVEL_NUMBER ON ROWS FROM [Adventure Works] CELL PROPERTIES VALUE, BACK_COLOR, FORE_COLOR, FORMATTED_VALUE, FORMAT_STRING, FONT_NAME, FONT_SIZE, FONT_FLAGS 

These changes are specific to the report requirements. In our case, the IIF expression checks the value of the ShowDescendants parameter. If the user has selected the False option, the query brings only the selected employee on rows. Otherwise, we use the Descendants function to bring the descendants of the selected employee.

Looks hacky? Join me and make a wish on connect.microsoft.com to improve the SSRS-SSAS integration. Don’t be surprised if you are not active that the MDX Query Designer will continue to sneak from one version to another unchanged.

Retrieving Cube Metadata in Reports

An interesting question popped up on the discussion list the other day about how to retrieve a list of the cube measures so the report author can display them in a report parameter.

Solution 1: If you just need a list of measures but not the measure metadata, such as display folder, etc., you can use the following (not trivial, ok hacky) query, which I borrowed from my esteemed colleague MVP and MDX guru Chris Webb:

WITH SET MYSET AS HEAD([Date].[Date].[Date].MEMBERS, COUNT(MEASURES.ALLMEMBERS)-1)

MEMBER MEASURES.MeasureUniqueName AS  MEASURES.ALLMEMBERS.ITEM(RANK([Date].[Date].CURRENTMEMBER, MYSET)-1).UNIQUENAME

MEMBER MEASURES.MeasureDisplayName AS MEASURES.ALLMEMBERS.ITEM(RANK([Date].[Date].CURRENTMEMBER, MYSET)-1).NAME

SELECT {MEASURES.MeasureUniqueName, MEASURES.MeasureDisplayName} ON 0,

MYSET ON 1

FROM [Adventure Works]

 

Solution 2: This approach is especially useful if you need the measure properties. In this case, you must use the OLE DB Provider for Analysis Services instead of the SSRS Analysis Services Provider (the one that gives you the nice MDX Designer). This approach uses the MDSCHEMA_MEASURES rowset, which is one of the built-in SSAS schema rowsets. Darren Gosbell provides a great overview of the SSAS rowsets here and Vidas Matelis gives more insights here. Thanks to the schema rowsets, getting a list of measures can’t simpler.

SELECT * from $System.MDSCHEMA_MEASURES

The user will require Read Definition rights (on Role Properties General tab, check Read Definition) to obtain the metadata. If the user doesn’t have this right, an empty dataset will be returned.

There are of course additional schema rowsets, such as MDSCHEMA_KPIS if you want to get a list of all KPIs or DBSCHEMA_TABLES in case you want to discover the cube dimensions and measure groups. BOL provides a full list here. Aren’t SSAS schema rowsets cool?

 

Named Set Subtotals in Excel

So many SSAS browsers, so little resemblance… And each browser has an independent view of which SSAS features it cares about and how it will go about implementing them. For example, Excel 2007 will let you use named sets on rows or columns but not as a filter. The cube browser and Report Builder let you use them in the filter area but not on columns and rows.

Issue: Business users have requested a subtotal for named set in Excel 2007. Say, you’ve defined the following named set in the cube to return the last 12 months from the current month.

CREATE DYNAMIC SET CURRENTCUBE.[Last 12 Months] AS

[Date].[Date Hierarchy].CurrentMember.Lag(11):[Date].[Date Hierarchy].CurrentMember

 

If you drop this named set on an Excel 2007 PivotTable report, you would notice that it doesn’t give you a subtotal that returns the aggregated total of the measure used in conjunction with the set. However, the chances are that you would need a server subtotal, especially if you use scope assignments to derive the subtotal value. By contrast, the cube browser will show the subtotal if you use the named set as a filter and drag the corresponding dimension (in this case Date) to the rows or columns. This difference of course originates from the MDX query the client generates. In the latter case, the cube browser hiearchizes the dimension All member so the server returns this member, while a named set would simply returns the members of the set and nothing more.

Workaround: A simple workaround in Excel 2007 to get a subtotal is to add the dimension all member, such as:

CREATE DYNAMIC SET CURRENTCUBE.[Last 12 Months] AS

{[Date].[Date Hierarchy].CurrentMember.Lag(11):[Date].[Date Hierarchy].CurrentMember,
[Date].[Date Hierarchy].[All]}

Of course, if the named set uses another dimension, you need to add the All member of that dimension. For example, adding a subtotal to the Adventure Works Top 50 Customers will require the following change:

 

Create Dynamic Set CurrentCube.[Top 50 Customers]

As {TopCount

(

[Customer].[Customer].[Customer].Members,

50,

[Measures].[Internet Sales Amount]

), [Customer].[Customer].[All Customers]}

 

Wish List: Unify Microsoft-based browsers as much possible in terms of functionality. In this case, it will be nice, if Excel supports named sets as filters.

Implementing Cube Settings

I had to tackle an interesting requirement the other day. A cube had a number of KPIs defined but the threshold values that were used to define the “good”, “bad”, and “so-so” status ranges had to be defined as configurable values at any level of the organization dimension and KPIs. I settled on implementing a measure group that intersects the organization dimension and a new KPI dimension that lists the KPI names.

011409_0257_Implementin1

For example, as the screenshot shows, each KPI has two threshold values (Threshold Value1 and Threshold Value2) which can be defined at any level of the organization hierarchy. Note that the parent values read the setting values directly from the fact table instead of being derived by rolling up from the children.

The first implementation approach that comes to mind is to use the None AggregateFunction function for the measure group so you read the measure values associated with the member directly from the fact table. Some experimenting revealed the following “noble” truths about the None AggregateFunction:

    1. BOL is wrong. I refer to the following statement:
      (None AggregateFunction) No aggregation is performed, and all values for leaf and nonleaf members in a dimension are supplied directly from the fact table for the measure group that contains the measure. If no value can be read from the fact table for a member, the value for that member is set to null.

Specifically, When None is used, the server loads the fact data into the measure group leaves (just like with other aggregate function) but it doesn’t roll up the values to the dimension leaves. See this Mosha’s post to understand the difference between the dimension and measure group leaves.

  1. The way it stands, the None function brings a questionable value. A narrow case exists where you may want to perform assignments or calculations in the measure group leaves as the above post demonstrates. Why you would want to do this instead performing them at the fact table level (or as named calculations) for obvious performance advantages is beyond me.
  2. There is no way to bring the measure leaf values to the dimension leaves. Shoot me a note if you manage to come with some clever hack to do so.

Note: It would have been great if the None function behaves as described in the documentation. Even better, it could be nice if the server automatically brings up the fact value when the cube slice results in one fact record, i.e. the user has drilled down to measure group grain. This will make it possible to implement text-based measures, such to capture comments, or other text-based attributes, that don’t justify moving them to a dimension. If you agree, vote for this enhancement on connect.microsoft.com.

So, how do we get around this predicament? One option is define a custom ~ operator for the Organization parent-child dimension so the server doesn’t roll up the measure. However, the custom operator will be applied to all measures sliced by the Organization dimension which more than likely will be an issue. Of course, a trivial workaround is to duplicate the Organization dimension but this leads to other disadvantages. IMO, a better solution might be to use the DataMember property and a scope assignment:

SCOPE (

MeasureGroupMeasures(“KPI Configuration”),

[Organization].[Organization Hierarchy].Members,

[KPI].[KPI].[KPI]

);

this = ([Organization].[Organization Hierarchy].CurrentMember.DataMember);

END SCOPE;

I scope at the KPI Configuration measure group where it intersects with the members of the Organization and KPI dimensions. The AggregationFunction property of the measures in the KPI Configuration measure group is now set to SUM. The tuple on the right side of the this assignment returns the measure value associated directly with the current data member. The new result is that this assignment overwrites the SUM function with the measure leaf value.

Dimensionalizing Pivoted Measures

Issue

You have a table with pivoted measures, such as:

121608_0043_Dimensional1

Use the following statement to create and populate the table:

CREATE
TABLE pvt
(VendorID int, DateID int, MaturityAmount1 int, MaturityAmount2 int,

MaturityAmount3 int, RepricingAmount1 int, RepricingAmount2 int, RepricingAmount3 int);

GO

INSERT
INTO pvt VALUES (1,1,4,3,5,4,4, 100);

INSERT
INTO pvt VALUES (2,1,4,1,5,5,5, 100);

INSERT
INTO pvt VALUES (1,2,4,3,5,4,4, 200);

INSERT
INTO pvt VALUES (2,2,4,1,5,5,5, 200);

INSERT
INTO pvt VALUES (3,1,4,3,5,4,4, 300);

INSERT
INTO pvt VALUES (3,2,4,3,5,4,4, 400);

INSERT
INTO pvt VALUES (3,3,4,3,5,4,4, 100);

INSERT
INTO pvt VALUES (4,1,4,2,5,5,4, 100);

INSERT
INTO pvt VALUES (5,1,5,1,5,5,5, 200);

GO

Let’s say you need a way to unpivot the results in the following format which is more suitable for OLAP:

121608_0043_Dimensional2

In this way, you eliminate the repeating measure columns with a single column. Here, the BucketID column is introduced so you could join the results to a dimension to let the user cross-tab the results on columns.

Solution

Usually, ETL processes would take care of transforming data in format suitable for analysis especially with large data volumes. However, with smaller fact tables and an OLAP layer directly on top of the operational database, such as having SQL views directly on top of the source database, you may need an ad-hoc approach to unpivot the results. Assuming you are fortunate to have SQL Server 2005 or above, you can use the SQL Server UNPIVOT function to unpivot on multiple columns. The following query does the trick:

SELECT VendorID, DateID,
CAST(REPLACE(Maturity,
‘MaturityAmount’,
)
as
int)
as BucketID,

MaturityAmount, RepricingAmount

FROM


(SELECT VendorID, DateID, MaturityAmount1, MaturityAmount2, MaturityAmount3,

RepricingAmount1, RepricingAmount2, RepricingAmount3 FROM pvt) p

UNPIVOT


(MaturityAmount FOR Maturity IN
(MaturityAmount1, MaturityAmount2, MaturityAmount3))
AS UnPivotedMaturity

UNPIVOT


(RepricingAmount FOR Repricing IN
(RepricingAmount1, RepricingAmount2, RepricingAmount3))
AS UnPivotedRepricing

WHERE
REPLACE(Maturity,
‘Maturity’,
)
=
REPLACE(Repricing,
‘Repricing’,
)

Notice that a WHERE clause is needed to join the two unpivoted sets so you don’t end up with a Cartesian set. The net result is that the number of rows in the unpivoted dataset is:

Number of Rows in Pivoted (original) Table * Number of Repeated Columns

So, in our case, the dataset will have 9 * 3 = 27 rows.

Referencing Parameters in SSRS Calculated Members

Issue: You author an SSRS report that connects to a cube. You want to create a calculated member that uses a report parameter. For example, the Selector calculated member in the following query returns 1 if a given employee has exceeded 1 million in sales for the dates specified by the user in the report parameter.

WITH MEMBER [Measures].[Selector] AS SUM (

StrToSet(@DateCalendar), iif([Measures].[Reseller Sales Amount]>1000000, 1, NULL)

)

select non empty {[Measures].[Reseller Sales Amount],[Measures].[Reseller Order Quantity], selector} on 0,

non empty (Filter([Employee].[Employee].[Employee].Members, [Measures].[Selector]>0) * [Date].[Calendar].[Calendar Year]) on 1

from (select StrToSet(@DateCalendar) on 0 from [Adventure Works])

 

However, when you run the query you get the following error if you target Analysis Services 2005:

The DateCalendar parameter cannot be resolved because it was referenced in an inner subexpression.

Obviously, the built-in SSAS data provider has an issue parsing parameters in a calculated member. This issue is apparently fixed in SQL Server 2008 where the above query executes just fine.

Workaround: As a workaround with SQL Server 2005, try defining a set that references the parameter and use the set in the calculated members, as follows:

WITH SET [SchoolYears] AS STRTOSET(@DateCalendar)

MEMBER [Measures].[Selector] AS SUM (

[SchoolYears], iif([Measures].[Reseller Sales Amount]>1000000, 1, NULL)

)

select non empty {[Measures].[Reseller Sales Amount],[Measures].[Reseller Order Quantity], selector} on 0,

non empty (Filter([Employee].[Employee].[Employee].Members, [Measures].[Selector]>0) * [Date].[Calendar].[Calendar Year]) on 1

from (select StrToSet(@DateCalendar) on 0 from [Adventure Works])

SSAS WHERE vs. Subselect Puzzle

I’ve been involved recently in a quest to optimize an SSAS 2008 long-running query that would return the 12-month history of a KPI. Since the KPI was displayed on a dashboard page alongside other KPIs and charts, it was important to optimize the query as much possible. Rephrased to Adventure Works, the query went something like this:

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select -{[Customer].[Customer].&[20075]} on 0 from [Adventure Works])

where {[Sales Territory].[Sales Territory Country].&[United States], [Sales Territory].[Sales Territory Country].&[United Kingdom]}

This query compares the sales for each customer with the customer’s sales for the previous month and returns the number of customers who have increased sales for the past 12 months. Also, the query uses a subselect to exclude a customer and a WHERE clause that restricts the results to USA and United Kingdom only. The real-world query would get the members to filter in the WHERE clause from the UI.

The first optimization technique to point out is that instead of filtering the customer set using the Filter function, the query benefits from Mosha’s summator trick, which alone improved the query performance about ten times. However, the query was still taking long to execute (some 30 seconds with set of 150,000 customers). Much to my surprise though, flipping the WHERE and the subselect clauses cut the query time in half.

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum([Customer].[Customer].[Customer], [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])

where -{[Customer].[Customer].&[20075]}

Here, the member that needs to be excluded was moved to the query WHERE clause, while the rest of the filter went to the subselect. While I don’t know what the server does exactly, my hypothesis is that the WHERE clause is more efficient in restricting the cube space before the rest of the query is executed. I wasn’t able to reproduce the performance gain with the Adventure Works cube probably because is too small or the issue was design specific. Larger cubes may benefit from the filter flip.

IMPORTANT  OOPS [:(]

As Mosha was quick to point out, in particular case where SUM is used, the attempt to exclude a given customer in both cases (subselect and WHERE) will not work. Specifically, filtering out the customer is a no-op, since the Customer.Customer.Customer inside Sum overwrites it. Consequently, the queries don’t exclude this customer from the computations at all and therefore are wrong. A great optimization without QA involvement, indeed :-). However, I still can’t explain why subselect is slower if both options are no-ops. BTW, if the exclude filter was done on another attribute (not Customer), then the subselect would produce the correct results, while the WHERE clause will still be a no-op.

In this particular case, the easiest way to exclude specific customers is to remove them from the set inside the SUM function, as the following query shows. This, of course, will impact the query performance. Based on my test, the member exclusion adds about 25% overhead to the query.

 

WITH MEMBER increaser AS Iif([Measures].[Internet Sales Amount]>([Measures].[Internet Sales Amount], [Date].[Calendar].PrevMember),1,NULL)

MEMBER CustomersWithIncrease as Sum({[Customer].[Customer].[Customer] – [Customer].[Customer].&[20075]} , [Measures].[increaser])

SELECT [Measures].[CustomersWithIncrease] on 0,

ParallelPeriod ([Date].[Calendar].[Month], 11, [Date].[Calendar].[Month].&[2003]&[12]) :[Date].[Calendar].[Month].&[2003]&[12] on 1

from (select {[Sales Territory].[Sales Territory Country].&[United States],

[Sales Territory].[Sales Territory Country].&[United Kingdom]}

on 0 from [Adventure Works])