Prologika Forums
Making sense of data
Cube vs. VertiPaq Query Performance

Blogs

Prologika (Teo Lachev's Weblog)

Books

Applied Microsoft SQL Server 2012 Analysis Services (Tabular Modeling)Learn PowerPivot and Analysis Services-Tabular at your own pace with our latest book Applied Microsoft SQL Server 2012 Analysis Services (Tabular Modeling). It is designed as an easy-to-follow guide for learning how to implement BI solutions spanning the entire personal-team-organizational BI spectrum.

Training

Applied Microsoft SQL Server 2008 Reporting Services

We offer onsite and online Business Intelligence classes!  Contact us about in-person training for groups of five or more students.

For more information or to register click here! 

Syndication

This is a big topic and frankly it's too ambitious on my part to tackle it. Assuming equivalent multidimensional (BISM Multidimensional) and tabular (BISM Tabular) models, I was curious how a multidimensional cube fares against VertiPaq in terms of performance. To be fair to VertiPaq, I decided to use native DAX queries. As you've probably heard, BISM Tabular in SQL Denali will include a variant of DAX to query tabular models deployed to SharePoint and SSAS running in VertiPaq mode. Chris Webb has a good writeup about DAX queries here. The DAX EVALUATE construct allows external clients to query tabular models using native DAX syntax instead of MDX. Since BISM Tabular speaks DAX, DAX queries are likely to be more efficient and give you better performance when querying tabular models. At this point, only Crescent generates native DAX queries. The DAX query syntax is:

DEFINE

MEASURE Table1 [measure1] = <DAX_Expression>

MEASURE Table2 [measure2] = <DAX_Expression>

EVALUATE <DAX Table Expression>

ORDER BY

    <DAX_Expression> [ASC | DESC]

    <DAX_Expression> [ASC | DESC]

START AT

    Value_or_Parameter, Value_or_Parameter, ...

 

To have a more sizable dataset, I used the Contoso cube for my tests. I created a BISM Tabular model and imported the Contoso data. Since you probably don't have Contoso, I provide queries that target the Adventure Works cube. I've started with the following unoptimized MDX query which calculates the average sales amount by date across products whose daily sales exceed the daily sales for the same date in the previous month:

 

WITH

MEMBER [Measures].SlowAvg AS

Avg

(

Filter

(

[Product].[Product].[Product].MEMBERS

,[Measures].[Sales Amount] > ([Measures].[Sales Amount], ParallelPeriod([Date].[Calendar].[Month]))

)

,[Measures].[Sales Amount]

)

SELECT

[Measures].SlowAvg ON 0,

[Date].[Calendar].[Date].Members ON 1

FROM [Adventure Works];

 

Then, I optimized the query to take advantage of block computation mode, as follows:

 

WITH

MEMBER diff as iif ([Measures].[Sales Amount] > ([Measures].[Sales Amount], ParallelPeriod([Date].[Calendar].[Month])), [Measures].[Sales Amount], null)

MEMBER [Measures].SlowAvg AS

Avg

(

[Product].[Product].[Product].MEMBERS, diff

)

SELECT

[Measures].SlowAvg ON 0,

[Date].[Calendar].[Date].Members ON 1

FROM [Adventure Works];

 

Finally, my equivalent DAX query that used to measure performance was:

 

define measure FactResellerSales[TotalSales] = Sum([SalesAmount])

measure FactResellerSales[TotalSales - LastYear] = [TotalSales](SamePeriodLastYear(DimDate[FullDateAlternateKey]), All(DimDate))

measure FactResellerSales[AverageSales] = AverageX(Filter(Values(DimProduct[ProductKey]), [TotalSales] > [TotalSales - LastYear]), [TotalSales])

evaluate addcolumns(filter(values(DimDate[DateKey]), not isblank([AverageSales])), "AverageSalesAmount", [AverageSales])

order by [DateKey]

 

And, the findings from the tests:

1.       MDX query un-optimized (cell calculation model) both on cold cache and executed second time – 33 sec

2.       MDX query optimized (block computation mode) on cold cache – 4.8 sec

3.       MDX query optimized (block computation mode) executed second time – 0.7 sec

4.       DAX query both on cold cache and executed second time – 6.4 sec

 

Here are some take-home notes:

  1. The fact that VertiPaq is an in-memory database doesn't mean that it will perform much better than a multidimensional cube. The formula engine of BISM Multdimensional does cache query results in memory. So does the Windows OS. In fact, the more the cube is used, the higher the chances that its data will end up in memory.
  2. VertiPaq might give you good performance without special tuning. All DAX calculations run in a block computation mode.
  3. Optimized MDX queries might outperform VertiPaq especially if results are cached.
  4. DAX queries are never cached which explains why DAX queries perform the same when executed subsequently.

 

The fact that VertiPaq gives you a head start doesn't mean that you cannot write inefficient DAX queries. For example, the following DAX measure definition returns the same results but it's twice as slow.

 

measure FactResellerSales[AverageSales] = AverageX(Filter(AddColumns(Values(DimProduct[ProductKey]), "x", [TotalSales]), [x] > [TotalSales - LastYear]), [x])

 

Again, this is an isolated test case and your mileage might vary greatly depending on queries, data volumes, hardware, etc. But I hope you could use it as a starting point to run your own tests while waiting for a VertiPaq performance guide.


Posted Wed, Jul 27 2011 8:50 PM by Teo Lachev
Filed under: , ,