
SSRS Tabular Query Designer

Analysis Services Tabular has gained a lot of momentum for implementing organizational semantic models, but its toolset has been lacking. Previously, SSRS developers had to rely on the MDX Query Designer to send MDX queries to Tabular. If DAX queries were preferred (and they often are for better performance, since DAX is the native Tabular query language), developers had to type the DAX in manually. And if the query took parameters, the ugly workaround was to use … the DMX Query Designer.

As Chris Finlan announced yesterday, this will all change starting with the newly released Report Builder, SSDT 17 (currently in release candidate state), and SSMS 17 (also in RC state), as they now feature a Tabular Query Designer. SSMS also includes a DAX Query Editor, as Christian Wade explains here.

If you’re familiar with the MDX Query Designer, you already know how to use the Tabular Query Designer. When you connect to Tabular, the query designer will discover it and default to DAX query mode. The dropdown allows you to switch to MDX should you prefer to send MDX queries to Tabular.

Similar to its MDX counterpart, in most cases you'd still need to know DAX to customize your queries. If you connect to Tabular version 2016 or above, you can drag and drop metadata to autogenerate the query. I think version 2016 or above is required because, when autogenerating queries, the designer relies heavily on the DAX SUMMARIZECOLUMNS function, which was introduced in 2016.
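For example, dragging a date attribute and a measure onto the designer might autogenerate a query along these lines (a hypothetical sketch; the table and column names are illustrative, not what the designer would produce for your model):

EVALUATE
SUMMARIZECOLUMNS (
    'Date'[Calendar Year],
    "Sales Amount", SUM ( 'Reseller Sales'[Sales Amount] )
)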

The Design Mode lets you examine the underlying DAX query and customize it as needed. Sorry, no color coding or syntax checking in SSRS, so you'd probably rely on DAX Studio, or the new DAX Query Editor in SSMS with its syntax checking and IntelliSense, to code and test your queries. The Calculated Member option is disabled, so you're on your own defining custom DAX measures. No big surprises here, as it works exactly the same way as in MDX (of course, you need DAX formulas, such as VALUE to convert a string to a numeric value).

It was about time for the Tabular Query Designer to appear and help us write DAX when designing traditional SSRS reports connected to Tabular. Watch out for a bug where auto-generated parameter datasets error out when hierarchies are used. I attach two reports: Report.rdl demonstrates an auto-generated DAX query, and ReportEx.rdl demonstrates a customized DAX query.

Aggregates over Aggregates in DAX

Sometimes, you might run into a scenario that requires an aggregate over an aggregate. Consider the following report:

The AvgOrderAmount column has a simple calculation: SUM(Sales)/SUM(Qty). This calculation is applied uniformly, so the Total line divides 300 by 7 and returns 42.86. But what if the user wants the result in the DesiredAvgOrderAmt column, which produces a simple average over the details: (50+40)/2 = 45? This can be accomplished by creating a base measure for the detail calculation:

DesiredAvgOrderAmtBase := DIVIDE ( SUM ( [Sales] ), SUM ( [Qty] ) )

Then, the DesiredAvgOrderAmt formula would be:

DesiredAvgOrderAmt :=
IF (
    HASONEFILTER ( Customer[Customer] ),
    [DesiredAvgOrderAmtBase],
    AVERAGEX ( VALUES ( Customer[Customer] ), [DesiredAvgOrderAmtBase] )
)

HASONEFILTER returns TRUE when the calculation is performed at the detail level (a single customer) and FALSE in the grand totals. In the latter case, the formula performs a simple average over the detail aggregates. Think of it as performing a second pass over the details to produce an aggregate over aggregates.

Power BI Quick Calculations

DAX is very powerful, but it can entail a steep learning curve for novice users. Power BI Desktop has started down the road of delivering prepackaged calculations. The first one is Percent of Grand Total. Once you add a measure to the Values zone, you can expand the measure dropdown and click Quick Calc, similar to how you would pick one of the default aggregation functions.


Interestingly, Power BI Desktop doesn't add a new calculation when the Quick Calc feature is used. Instead, it probably creates an implicit DAX measure whose formula you can't see. The original measure is renamed to %GT <original measure name>.
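If you want to author the equivalent yourself, a percent-of-grand-total measure might look like this (a sketch, assuming a Sales table with a SalesAmount column; this isn't necessarily the formula Quick Calc generates behind the scenes):

%GT Sales =
DIVIDE (
    SUM ( Sales[SalesAmount] ),
    CALCULATE ( SUM ( Sales[SalesAmount] ), ALLSELECTED ( Sales ) )
)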


Prepackaged calculations are a frequently requested Power BI feature, so expect additional quick calculations to light up soon. As a recommendation for improvement, it would be nice to be able to see the DAX formula behind the quick calc.

Power BI Measure Dimensions

I had an inquiry about how to implement in Power BI/Power Pivot/Tabular something similar to the new Level of Detail (LOD) Expressions feature in Tableau 9.0. A more generic question would be how to turn a measure into a dimension so that you can analyze your data by the measure's distinct values. In my opinion, though, most real-life requirements go beyond just using the distinct values. For example, the first sample Tableau report demonstrates a banding example. But what if you want to group measure values into "buckets", e.g. 1, 2, 3, 4, 5-10, 10-20, etc.?

Fortunately, DAX has supported powerful expressions since its very beginning. To make the code more compact, the example below uses DAX variables, which were introduced in Power BI Desktop and Excel 2016. However, you don't have to use variables if you're on an older version of Excel. The sample model (attached) has three tables: FactInternetSales, DimCustomer, and DimDate. The scenario is to analyze data by a NumberOrders band that discretizes the number of orders customers have placed over time. The trick is to add a calculated column to DimCustomer with the following DAX formula:

NumberOrders =
VAR SumOrders =
    CALCULATE ( DISTINCTCOUNT ( FactInternetSales[SalesOrderNumber] ) )
RETURN
    SWITCH (
        TRUE (),
        SumOrders <= 4, FORMAT ( SumOrders, "General Number" ),
        SumOrders >= 5 && SumOrders <= 10, "5-10",
        SumOrders > 10, "More than 10"
    )

This expression defines a SumOrders variable whose expression calculates the number of orders each customer has in FactInternetSales. Because the grain of FactInternetSales is a sales order line item, we need DISTINCTCOUNT. Then, for more compact syntax, instead of a bunch of nested IF functions, we use the SWITCH function to evaluate the SumOrders variable and create the bands. This is where the variable comes in handy. Without a variable, you'd need to create a separate measure for CALCULATE ( DISTINCTCOUNT ( FactInternetSales[SalesOrderNumber] ) ) and reference that measure so you don't repeat the same formula in the SWITCH statement. Variables also have the advantage of evaluating the expression once, so expressions that reference them should perform better.
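For older versions of Excel without variable support, the measure-based alternative might look like this (a sketch; OrderCount is a hypothetical helper measure, which relies on context transition when referenced in the calculated column):

OrderCount := DISTINCTCOUNT ( FactInternetSales[SalesOrderNumber] )

NumberOrders =
SWITCH (
    TRUE (),
    [OrderCount] <= 4, FORMAT ( [OrderCount], "General Number" ),
    [OrderCount] >= 5 && [OrderCount] <= 10, "5-10",
    [OrderCount] > 10, "More than 10"
)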

Now we can have a report that uses the measure dimension, such as to show sales by the number of orders.


DAX Editor Adds Support for Tabular Default Members

UPDATE 4/24/2019: The new JSON-based Tabular schema doesn't support extensions, so DAX Editor and BI Developer Extensions (formerly BIDS Helper) no longer work.

In my previous blog, I announced a few new features of DAX Editor for Tabular. Today, I checked in another change that adds support for default members. Currently, Tabular doesn't have a UI for defining default members. However, you can define default members in the MdxScript section of the BIM file using MDX syntax, just as you can in Multidimensional.


The only issue is that if you make a change to any Tabular calculated measure in the designer, SSDT will regenerate the script and your manual changes will be lost. That is, unless you use DAX Editor, whose support for default members keeps them in the script. As a disclaimer, I took the backdoor approach to defining the default member syntax. The right approach would be to build upon the wonderful work of Nickolai Medveditskov and check the default member syntax, so that DAX Editor catches syntax errors as you type and when you attempt to save the measures. But that would have required a lot of work with Managed Babel. Moreover, the next version of SQL Server might support default members natively, so all this work could be wasted.

So, I've decided to implement default members by allowing you to define a custom MDX script section in the *.dax file that is carried verbatim to the Tabular model without any syntax checking or changes. Of course, if you make an error, e.g. reference a column that doesn't exist or use the wrong syntax, you'll find out pretty quickly when you attempt to open the *.bim file. To fix this, check your syntax in DAX Editor and save the changes to the *.bim file again. Or, open the *.bim file source and remove the custom MDX script, which you'll find inside the <MdxScript> element.

The syntax for default members is simple:

  1. In the DAX file, add the following section at the top of the file. (It can actually appear anywhere, but I'd typically put it on top, as I do when working with cube scripts.)

    -- MDX SCRIPT --

    ALTER Cube CurrentCube Update Dimension [Product].[Is Finished Goods], Default_Member = [Product].[Is Finished Goods].&[TRUE];

    ALTER Cube CurrentCube Update Dimension [Currency].[Currency Code], Default_Member = [Currency].[Currency Code].&[USD];

    -- MDX SCRIPT --

IMPORTANT: Notice that the script block has to be enclosed with "-- MDX SCRIPT --" comments exactly as they appear in the example. DAX Editor looks for this pair of comments to identify the custom script block.

  2. Inside the comments, enter ALTER CUBE statements to set the default members using MDX syntax, as you would in Multidimensional cube scripts. Notice the MDX unique member name syntax for the Default_Member property. In this case, I set two default members: the Is Finished Goods column in the Product table is set to the TRUE member ([Product].[Is Finished Goods].&[TRUE]), and the Currency Code column in the Currency table is set to USD.
  3. Click the "Save Measures to BIM File" button on the DAX Editor toolbar to apply the DAX Editor changes to the *.bim file as usual.


I haven’t tested it yet but you should be able to add other MDX script commands to the DAX Editor script, such as KPI definitions, which DAX Editor doesn’t currently support.

Enjoy!

Projecting a Scalar Value with DAX EVALUATE Queries

When you work on more complicated DAX measures, you should get out of the Power Pivot or BISM design environment and use the excellent DAX Studio (or SSMS) with the EVALUATE query syntax. But a measure returns a scalar value, while EVALUATE requires a table. You can use the DAX ROW function to create a single-row, single-column table and then show the result with EVALUATE.
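For example, to test a hypothetical [Total Sales] measure, the query might look like this:

EVALUATE
ROW ( "Total Sales", [Total Sales] )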


Filtering Tables Dynamically in DAX

Sometimes, your DAX calculated measures might need to filter a table dynamically based on a certain condition. For example, you might have a Type 2 table like this one:

ClaimID | ClaimDate | ClaimStatus | RowStartDate | RowEndDate
XXX-1   | 1/1/2015  | Open        | 1/1/2015     | 1/15/2015
XXX-1   | 1/1/2015  | Approved    | 1/16/2015    | 12/31/9999

When a change is detected on an existing claim, this design expires the previous row and adds a new one. A common business question would be "How many claims do we have as of a given date?" Tabular is more flexible than Multidimensional in answering this question because you can filter the table directly. In your first attempt, you might write the following calculated measure:

ClaimCount:=CALCULATE(DISTINCTCOUNT([ClaimID]), [RowStartDate]<=[FilterLastDate] && [FilterLastDate]<=[RowEndDate])

where FilterLastDate is another calculated measure that returns the max date when the user has selected a date range, such as an entire month from a Date hierarchy, so that the measure is evaluated as of the last date of the user selection. However, this attempt fails with the error "A function 'CALCULATE' has been used in a true/false expression that is used as a table filter expression. This is not allowed." This article provides more context, although I personally believe the explanation doesn't give all the details, as the second argument is Boolean in this case. To fix this, you need a more complicated expression that uses the FILTER function:

ClaimCount:=CALCULATE(
    DISTINCTCOUNT([ClaimID]),
    FILTER(
        ALL(ClaimSnapshot[ClaimDate], ClaimSnapshot[RowStartDate], ClaimSnapshot[RowEndDate]),
        [RowStartDate]<=[FilterLastDate] && [FilterLastDate]<=[RowEndDate]
    )
)

The FILTER function applies the same filtering logic. If the table you filter is related to a Date table, such as on ClaimDate, you need to ignore the Date context; otherwise, the calculated measure's scope will be limited by the Date selection. To do so, use the ALL function, but because ALL returns a table, the result needs to include all columns that you filter on.
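For completeness, the FilterLastDate measure mentioned above might be defined along these lines (a sketch, assuming a Date table related to ClaimSnapshot):

FilterLastDate:=MAX('Date'[Date])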

DAX Variance Calculations

Variance calculations are a common BI requirement, but good reference material is lacking. How do we create them in DAX? Let's say you need to calculate the variance between this month's sales and the previous month's sales.

  1. DAX date functions require a Date table; they won't work without a separate Date table. They also require that you tell PowerPivot (or Tabular) about your Date table. So, as a first step, select the Date table, go to the Design ribbon, and click Mark as Date Table. PowerPivot will infer the column of the Date data type (you must have such a column).


  2. Sort the Month column with a custom sort. For example, if your Month column returns the month in MMM-YY format, sort that column by another integer column in the format YYYYMM. More on this in a moment.
  3. Define a new calculated measure using the following formula:

=SUM(ResellerSales[SalesAmount]) - CALCULATE(SUM(ResellerSales[SalesAmount]), PREVIOUSMONTH('Date'[Date]))

What's going on here? The formula needs to work irrespective of the month on the report. In other words, if we put months on columns, each month must act as of that month; that's why we need a calculated measure. The first SUM returns the aggregated sales for the month. The CALCULATE function is used to overwrite the context, because we want to get to the previous month's sales. To do this, we pass the PREVIOUSMONTH function as a second argument and point it to the Date column in the Date table, as we do with other DAX date-related functions.


Here is something important: this report won't work with just months on columns if your Month column returns the name of the month only. That's because the month name repeats across years. This is why the PREVIOUSMONTH documentation insists on having both year and month on the report:

“To see how this works, create a PivotTable and add the fields, CalendarYear and MonthNumberOfYear, to the Row Labels area of the PivotTable.”

If you don't want to add the year to reports, add a MonthYear calculated column to your Date table with the formula =[MonthName] & " " & [CalendarYear], which makes the month unique across years, and another column, MonthNameSort, to sort by, with the formula =[CalendarYear] * 100 + [MonthNumberOfYear]. Then sort the first column by the second. To do this, in the PowerPivot window, click any cell in the MonthYear column. Then, on the Home ribbon, click the Sort by Column button and choose MonthNameSort as the sort-by column. Use the MonthYear column on the report, as in the example above.
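In calculated-column form (assuming the Date table has MonthName, CalendarYear, and MonthNumberOfYear columns, as in the Adventure Works date dimension):

MonthYear = [MonthName] & " " & [CalendarYear]

MonthNameSort = [CalendarYear] * 100 + [MonthNumberOfYear]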

Further considerations:

  1. If you want to calculate a variance between parallel periods, use the DAX ParallelPeriod() function.
  2. While a clumsy way exists in PowerPivot/Tabular to handle multiple relative dates (see the "Calculating many time periods within a single measure formula" section in the DAX Samples white paper), the formulas must still reference a specific measure, e.g. SalesAmount. That's because Tabular doesn't currently support scope assignments, which would allow you to create generic "shell dimensions" spanning multiple measures.
  3. What if you want the variance to work across months, quarters, and years? Because Tabular doesn't support functions for hierarchy navigation and no PreviousPeriod() function exists, you have to resort to the DAX Samples workaround and adjust the formulas depending on the level you are at in the Date hierarchy, such as using PreviousMonth() at the month level and PreviousYear() at the year level, as in the sketch below.
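As an illustration only, here is a simplified level-detection sketch (not the DAX Samples approach itself; it assumes a Date table with a MonthName column):

PreviousPeriodSales :=
IF (
    ISFILTERED ( 'Date'[MonthName] ),
    CALCULATE ( SUM ( ResellerSales[SalesAmount] ), PREVIOUSMONTH ( 'Date'[Date] ) ),
    CALCULATE ( SUM ( ResellerSales[SalesAmount] ), PREVIOUSYEAR ( 'Date'[Date] ) )
)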

Cube vs. VertiPaq Query Performance

This is a big topic and frankly it’s too ambitious on my part to tackle it. Assuming equivalent multidimensional (BISM Multidimensional) and tabular (BISM Tabular) models, I was curious how a multidimensional cube fares against VertiPaq in terms of performance. To be fair to VertiPaq, I decided to use native DAX queries. As you’ve probably heard, BISM Tabular in SQL Denali will include a variant of DAX to query tabular models deployed to SharePoint and SSAS running in VertiPaq mode. Chris Webb has a good writeup about DAX queries here. The DAX EVALUATE construct allows external clients to query tabular models using native DAX syntax instead of MDX. Since BISM Tabular speaks DAX, DAX queries are likely to be more efficient and give you better performance when querying tabular models. At this point, only Crescent generates native DAX queries. The DAX query syntax is:

DEFINE
    MEASURE Table1[measure1] = <DAX_Expression>
    MEASURE Table2[measure2] = <DAX_Expression>
EVALUATE <DAX_Table_Expression>
ORDER BY
    <DAX_Expression> [ASC | DESC],
    <DAX_Expression> [ASC | DESC]
START AT
    Value_or_Parameter, Value_or_Parameter, …

To have a more sizable dataset, I used the Contoso cube for my tests: I created a BISM Tabular model and imported the Contoso data. Since you probably don't have Contoso, I provide queries that target the Adventure Works cube. I started with the following unoptimized MDX query, which calculates the average sales amount by date across products whose daily sales exceed the daily sales for the same date in the previous month:

WITH
MEMBER [Measures].SlowAvg AS
    Avg
    (
        Filter
        (
            [Product].[Product].[Product].MEMBERS,
            [Measures].[Sales Amount] > ([Measures].[Sales Amount], ParallelPeriod([Date].[Calendar].[Month]))
        ),
        [Measures].[Sales Amount]
    )
SELECT
    [Measures].SlowAvg ON 0,
    [Date].[Calendar].[Date].MEMBERS ON 1
FROM [Adventure Works];

Then, I optimized the query to take advantage of block computation mode, as follows:

WITH
MEMBER diff AS
    iif
    (
        [Measures].[Sales Amount] > ([Measures].[Sales Amount], ParallelPeriod([Date].[Calendar].[Month])),
        [Measures].[Sales Amount],
        null
    )
MEMBER [Measures].SlowAvg AS
    Avg
    (
        [Product].[Product].[Product].MEMBERS,
        diff
    )
SELECT
    [Measures].SlowAvg ON 0,
    [Date].[Calendar].[Date].MEMBERS ON 1
FROM [Adventure Works];

Finally, the equivalent DAX query that I used to measure performance was:

DEFINE
    MEASURE FactResellerSales[TotalSales] = Sum([SalesAmount])
    MEASURE FactResellerSales[TotalSales - LastYear] = [TotalSales](SamePeriodLastYear(DimDate[FullDateAlternateKey]), All(DimDate))
    MEASURE FactResellerSales[AverageSales] = AverageX(Filter(Values(DimProduct[ProductKey]), [TotalSales] > [TotalSales - LastYear]), [TotalSales])
EVALUATE
    AddColumns(Filter(Values(DimDate[DateKey]), NOT IsBlank([AverageSales])), "AverageSalesAmount", [AverageSales])
ORDER BY [DateKey]

And here are the findings from the tests:

  1. MDX query un-optimized (cell calculation mode), both on cold cache and executed a second time: 33 sec
  2. MDX query optimized (block computation mode), on cold cache: 4.8 sec
  3. MDX query optimized (block computation mode), executed a second time: 0.7 sec
  4. DAX query, both on cold cache and executed a second time: 6.4 sec

Here are some take-home notes:

  1. The fact that VertiPaq is an in-memory database doesn't mean that it will perform much better than a multidimensional cube. The formula engine of BISM Multidimensional does cache query results in memory, and so does the Windows OS. In fact, the more the cube is used, the higher the chances that its data will end up in memory.
  2. VertiPaq might give you good performance without special tuning. All DAX calculations run in block computation mode.
  3. Optimized MDX queries might outperform VertiPaq, especially if results are cached.
  4. DAX queries are never cached, which explains why DAX queries perform the same when executed subsequently.

The fact that VertiPaq gives you a head start doesn't mean that you cannot write inefficient DAX queries. For example, the following DAX measure definition returns the same results but is twice as slow:

MEASURE FactResellerSales[AverageSales] = AverageX(Filter(AddColumns(Values(DimProduct[ProductKey]), "x", [TotalSales]), [x] > [TotalSales - LastYear]), [x])

Again, this is an isolated test case, and your mileage might vary greatly depending on queries, data volumes, hardware, etc. But I hope you can use it as a starting point to run your own tests while waiting for a VertiPaq performance guide.