Posts

Prologika Newsletter Spring 2024

One of the main goals and benefits of a semantic model is to centralize important business metrics and KPIs, such as Revenue, Profit, Cost, and Margin. In Power BI, we accomplish this by crafting and reusing DAX measures. Usually, implementing most of these metrics is straightforward. However, some might take significant effort and struggle, such as metrics that work at aggregate level. In an attempt to simplify such scenarios, the February 2024 release of Power BI Desktop includes a preview of visual calculations that I’ll review in this newsletter.

What’s a Visual Calculation?

As its name suggests, a visual calculation is a visual-scoped DAX measure that works at the aggregate (visual) level. Don’t confuse the term “calculation” here with calculated columns, tables, or groups. Replace “calculation” with “measure” and you will be fine. Consider the following matrix:

Suppose you need a measure that calculates the difference between the product categories in the order they are sorted in the visual. Implementing this as a regular DAX measure is a challenge. Yet, if we had a way to work with the cells in the visual, we can easily find a way to get this to work. Ideally, this would work similar in Excel, but DAX doesn’t know about relative cell references. However, visual calculations kind of do.

Let’s right-click on the visual and select “New calculation”. Alternatively, click the visual and then click the “New calculation” ribbon button in the Home ribbon. In the visual-level DAX formula bar, enter the following formula:

Category Diff = [Sum of SalesAmount] - PREVIOUS([Sum of SalesAmount], COLUMNS)

The Category Diff measure computes the difference between the current “cell” and the cell for the previous category. For example, for Bikes it will be $9,486,776-$28,266 (Accessories value).

The Good

As you can see, visual calculations can simplify aggregate-level metrics. Notice the use of the PREVIOUS function which is one the new DAX functions specifically designed for visual calculations. Notice also that the PREVIOUS function has additional arguments and one of them is the AXIS argument which specifies if the function should evaluate cells positionally on columns or rows. Traditional DAX functions don’t support the AXIS arguments because they are generic and not designed to operate on a visual level.

Finally, notice that visual-level formulas can only reference fields or measures placed in the visual. This is important as you’ll see later.

The Bad

Among the various limitations, the following will cause some pain and suffering:
1. You can’t export the data of a visual that has a visual calculation.
2. Drillthrough is disabled.
3. Can format visual calculations unless the visual supports it (the matrix visual does support measure formatting).
4. Can’t apply conditional formatting.
5. Can’t change sort order.
6. Can’t use field parameters.

The Ugly

The way Microsoft advertises this feature is that it is “easier than regular DAX” and performs better. My concern is that tempted by these promises, users will abuse visual calculations, such as for creating metrics that should be implemented as regular DAX measures so that any visual can benefit from them. And not before long, such users would find themselves into an Excel-like spreadmart hell which is what they tried to avoid by embracing Power BI.

I see a similar issue with DAX implicit measures, which Power BI users create by dragging a field and dropping it on a visual. I consider them a bad practice for a variety of reasons. Microsoft apparently doesn’t share the same concern (see their comments to my LinkedIn post on the same subject here). To their point, because visual calculations can only “see” fields placed in the visual, they naturally shield the user from abusing them. Let’s give it some time and see who’s right. Meanwhile, I wish this documentation article provides best practices and guidance on when to use implicit, explicit, and visual measures, including their limitations.

Visual calculations are incredibly useful but for limited scenarios. Therefore, please use these visual calculations only when regular DAX measures will not suffice. Business metrics should be centralized and should return consistent results, no matter the reporting tool or visual they are placed in. This is important for achieving the elusive single version of truth.


Teo Lachev
Prologika, LLC | Making Sense of Data
logo

A First Look at DAX Visual Calculations: the Good, the Bad, and the Ugly

The February 2024 release of Power BI Desktop includes a preview of visual calculations. As its name suggests, a visual calculation is a visual-scoped DAX measure that works at the aggregate (visual) level.

The Good

Visual calculations make previously difficult tasks much easier. Consider the following matrix:

Suppose you need a measure that calculates the difference between the product categories in the order they were sorted in the visual. Implementing this as a regular DAX measure is a challenge. Yet, if we had a way to work with the cells in the visual, we can easily find a way to get this to work. Ideally, this would work similar in Excel, but DAX doesn’t know about relative references. However, visual calculations do (kind of).

Let’s right-click on the visual and select “New calculation”. In the visual-level DAX formula bar, enter the following formula:

Category Diff = [Sum of SalesAmount] - PREVIOUS([Sum of SalesAmount], COLUMNS)

The Category Diff measure computes the difference between the current “cell” and the cell for the previous category. For example, for Bikes it will be $9,486,776-$28,266 (Accessories value).

Notice the use of the PREVIOUS function which is one the new DAX functions specifically designed for visual calculations. Notice also that the PREVIOUS function has additional arguments and one of them is the AXIS argument which specifies if the function should evaluate cells positionally on columns or rows. Finally, notice that visual-level formulas can only reference fields or measures placed in the visual.

The Bad

Among the various limitations, the following will cause some pain and suffering:
1. A visual calculation effectively disables exporting the visual data.
2. Drillthrough is disabled.
3. Can format visual calculations unless the visual supports it (the matrix visual does support measure formatting).
4. Can’t apply conditional formatting.
5. Can’t change sort order.
6. Can’t use field parameters.

The Ugly

The way Microsoft advertises this feature is that it is “easier than regular DAX”. My concern is that tempted by that promise, users will start abusing this feature left and right, such as for creating visual calculations that can be better implemented as regular DAX measures, e.g. for summing or averaging values. And not before long, such users would find themselves into an Excel-like spreadmart hell which is what they tried to avoid by embracing Power BI.

Therefore, please use these visual calculations only when regular DAX measures will not suffice. Business metrics should be centralized and should return consistent results, no matter the reporting tool or visual they are placed in. This is important for achieving the elusive single version of truth.

Implementing “Generic” Percent of Grand Total in DAX

Suppose you need to calculate a percentage of grand total measure. Easy, you can use the Power BI “Show value as” without any DAX, right? Now suppose that you have 50 Table visuals and each of them require the same measure to be shown as a percentage of total. Although it requires far more clicks, “Show value as” is still not so bad for avoiding the DAX rabbit hole. But what about if you need this calculation in another measure, such as to implement a weighted average? Now, you can’t reference the Microsoft-generated field because it’s not implemented as a measure.

That’s exactly the scenario I faced while working on a financial report, although at the end I followed another approach to calculate the weighted average that didn’t require a percentage of total. Anyway, the question remains. Is there a way to implement a “generic” percent of grand total for a given measure that will work irrespective of what dimensions are used in a Table or Matrix visuals? Consider the following simple report.

We want to show sales as a percentage of total irrespective of what dimension(s) are used in the report. Typically, to implement percentage of total measures you’d implement a DAX explicit measure that overwrites the filter context, such as:

% SalesAmount =
VAR _TotalSales = CALCULATE(SUM(ResellerSales[SalesAmountBase]), ALL('Product'))
RETURN
DIVIDE (ResellerSales[SalesAmount], _TotalSales)

This measure uses the ALL function to remove the filter from the Product table to calculate the sales across any field in that table. But we want this measure to work even if fields from other tables are used as dimensions.

Enter the magical ALLSELECTED function. From the documentation, ALLSELECTED “removes context filters from columns and rows in the current query, while retaining all other context filters or explicit filters”, which is exactly what’s needed. That’s because we want to ignore the context from fields used in the visual but apply other filters, such as slicers and visual/page/report filters, and cross filtering from other visuals.

And so, the formula becomes:

% SalesAmount =
VAR _TotalSales = CALCULATE(SUM(ResellerSales[SalesAmountBase]), ALLSELECTED())
RETURN
DIVIDE (ResellerSales[SalesAmount], _TotalSales)

And that’s all to it except if you need a percentage of column total in a Matrix visual that has a field in the Columns bucket. In this case, ALLSELECTED will ignore not only the dimensions on rows but also dimensions on columns. Then, the net effect will be a generic measure that calculates the percent of grand total instead of column total.

By the way, if you use the “Show value as” built-in feature and capture the query behind the visual, you’ll see that Microsoft follows a rather complicated way to calculate it to handle this scenario. Specifically, the visual generates two queries, where the first computes the visual totals and the second computes the percentage of total.

Testing DAX Measures

DAX can get complex and humble even experienced BI developers. Since Microsoft left us without a proper debugger, here a couple of techniques that I use to debug DAX when going gets tough:

  1. Variables – I often break down the formula in variables. As a bonus, variables make expressions easier to read and might yield performance gains. Consider the following SalesYoY% measure that calculates the variance in sales between the current period and same period last year.
    SalesYoY% =
    VAR _LastYearSales =
    CALCULATE (
    [InternetSales],
    SAMEPERIODLASTYEAR ( 'Date'[Date] )
    )
    RETURN
    IF (NOT ISBLANK([InternetSales]), DIVIDE ([InternetSales] - _LastYearSales, _LastYearSales))

    Let’s say you believe that last year’s sales look suspicious, such as the high value for 2011. You can comment the last line and return the LastYearSales variable to investigate further.

    SalesYoY% =
    VAR _LastYearSales =
    CALCULATE (
    [InternetSales],
    SAMEPERIODLASTYEAR ( 'Date'[Date] )
    )
    RETURN
    _LastYearSales
    --IF (NOT ISBLANK([InternetSales]), DIVIDE ([InternetSales] - _LastYearSales, _LastYearSales))
  2. The EvaluateAndLog function – With the introduction of the EvaluateAndLog function in Power BI, Microsoft has provided us with a poor man’s debugger that can print the output of a DAX expression. For example, to investigate the last year’s sales, you can enclose it with EvaluateAndLog.
    SalesYoY% =
    VAR _LastYearSales =
    CALCULATE (
    [InternetSales],
    SAMEPERIODLASTYEAR ( 'Date'[Date] )
    )
    RETURN
    IF (NOT ISBLANK([InternetSales]), DIVIDE ([InternetSales] – EvaluateAndLog(_LastYearSales), _LastYearSales))

    Then, you can use the SQL Server Profiler (distributed with SSMS) or the tool mentioned in the blog to see the output. I have the SQL Server Profiler registered as a Power BI Desktop external tool using the steps in this blog so that I can conviniently launch it connected to the Power BI Desktop model. Once the profiler opens, make sure to select the “DAX Evaluation Log” event.

    In this case, the event outputs the last year’s sales broken by year because the visual slices the measure by years.

Estimating DAX Query Completion Time

Usually, DAX queries execute very fast, like flashes in a gold seeker’s pan. Sometimes, however, you could end up with a massive DAX query that takes minutes if not hours. For example, a financial institution requested credit monitoring results to be evaluated overnight and saved in a relational database for fast retrieval. The query involved processing some 300+ measures for 40 million customers and took about an hour to complete. While working on optimizing the query and tracing its execution, I enabled the VertiPaq SE Query End event and was able to monitor how far the query got by seeing which measure is being processed. This was also useful to understand which measures are more expensive.

In general, the server doesn’t process measures in parallel. A measure may produce multiple storage queries, some in parallel and some sequentially. The VertiPaq SE Query End event would show the SQL-like query that the server sent to the storage engine, but usually you can easily correlate it to the DAX formula of the measure being processed. Thanks to the performance enhancements Microsoft is making, such as horizontal fusion, multiple measures may be coalesced into the same storage operation, and therefore a query with multiple measures might yield parallelism or could be sequential. On the other hand, there is plenty of parallelism within a Vertipaq query.

Atlanta MS BI and Power BI Group Meeting on June 6th (How Power Query Thinks)

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, June 6th, at 6:30 PM ET.  For more details and sign up, visit our group page.

Presentation:How Power Query Thinks: Taking the Mystery Out of Streaming and Query Folding
Date:June 6th
Time:6:30 – 8:30 PM ET
Place:Click here to join the meeting
Overview:How does Power Query produce the table data your expressions ask it to output? Query folding is one key concept—where behind the scenes, part or all of your M code may be rewritten into the data source’s native query language then offloaded to the source for execution. Streaming, though perhaps a less familiar term, is even more fundamental, as it describes how table data (as well as list and binary data) flows between functions in M.

 

Understand these concepts, and you’ll be better positioned to write more efficient mashups, debug problems and avoid unexpected variability in results. Join this session to learn about these key concepts—in a nutshell, to learn about How Power Query Thinks when you ask it to produce table data!

 

Speaker:Author of the Power Query M Primer Series, an in-depth dive into the Power Query language, Ben Gribaudo is a seasoned architect, developer and data engineer.
Prototypes without PizzaPower BI Latest

PowerBILogo

Atlanta MS BI and Power BI Group Meeting on April 4th

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, April 4th, at 6:30 PM ET.  Michael Carlo (Power BI MVP) will show how to speed up your data modeling experience by using DAX templates. For more details and sign up, visit our group page.

Presentation:World’s Fastest DAX – Using Quick DAX Templates
Date:April 4th
Time:6:30 – 8:30 PM ET
Place:Click here to join the meeting
Overview:DAX is easy to Learn but hard to master.

This session is all about using the Growing Community Driven Library of Quick DAX Templates. Learn how to find the templates and use them in your daily workflow to drastically speed up your Data Modeling Experience.

Session Covers the following topics:

1. Introduces the concept of Templatized DAX

2. Consuming a Quick DAX Template using Power BI External Tools

3. Discusses how to create your own templates and use them

Multiple demos of working directly inside the tooling for consuming and creating Quick DAX templates.

Speaker:Michael Carlo is very passionate about data and analytics. He spent a massive amount of time learning Power BI. Through this learning process he felt that others may also be interested. Thus, he created a website, a knowledge repository for all things PowerBI (powerbi.tips). This site is uniquely set up for people to read and learn about data modeling and visualizations within Power BI.
Prototypes without PizzaPower BI Latest

PowerBILogo

Referencing Columns in DAX Table Variables

Suppose you use a DAX table variable, such as to group by certain columns and add an extension column as a calculation. Then, you want to count the rows in the table by filtering on one of the columns. At your first attempt, you might try using CALCULATE.

VAR _t = ADDCOLUMNS(SUMMARIZE(…), "SomeColumn", <some expression>)
RETURN
CALCULATE(COUNTROWS(_t), _t[SomeColumn] = something)

You’ll get an error that the column you reference cannot be found. To get this to work, you must use FILTER (notice that code uses [SomeColumn] instead of the column fully qualified name (_t[SomeColumn).

VAR _t = ADDCOLUMNS(SUMMARIZE(…), "SomeColumn", <some expression>)
RETURN
COUNTROWS(FILTER(_t, [SomeColumn] = something)

Why doesn’t CALCULATE work? CALCULATE changes the filter context, but DAX filter context only allows columns in the model. It doesn’t allow “extension” columns created in a DAX expression. [SomeColumn] does not exist in the model but only in the DAX expression and therefore is not visible by CALCULATE.

On the other hand, FILTER resolves [SomeColumn] in row context. [SomeColumn] is bound to the column in table expression _t and it can be resolved.

Speaking of grouping, you might have noticed that Power BI Desktop uses SUMMARIZECOLUMNS in autogenerated DAX queries instead of ADDCOLUMNS(SUMMARIZE()). However, if you attempt to use SUMMARIZECOLUMNS in a measure, you’ll get an error “SummarizeColumns() and AddMissingItems() may not be used in this context”. How come?

SUMMARIZECOLUMNS is not supported in measures because it was designed for resultset-producing, autogenerated DAX queries. It wasn’t enabled for measures because it has different semantics when filters exist in the filter context, as in the measure scenario, and when filters are passed to it as arguments, as in the DAX query scenario. If you want to use SUMMARIZECOLUMNS in a measure to replicate the query results of a visual, you may not get the expected results due to the different ways the function treats filters from different sources and so it can’t be used in measures.

Rogue Q&A Queries

I’ve noticed severe performance degradation after refreshing a Power BI Desktop model with some five million rows. The Power BI Desktop process showed a sustained 50-60 % utilization for minutes in the Windows Task Manager. I did a profiler trace and I saw expensive DAX queries like these:

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[PONumber]),LEN(‘Sales'[PONumber])<=100),”valueColumn”,’Sales'[PONumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[SalesOrderNumber]),LEN(‘Sales'[SalesOrderNumber])<=100),”valueColumn”,’Sales'[SalesOrderNumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[InvoiceNumber]),LEN(‘Sales'[InvoiceNumber])<=100),”valueColumn”,’Sales'[InvoiceNumber])

As it turned out, Power BI Desktop autogenerates these queries when building a Q&A index. The 100-size limit is because Power BI wants to keep the index small. In addition, values that are longer than 100 characters are unlikely to be asked by the user. Why not check thd the maximum column value and skip the column? Power BI wants to skip instances that are too long but still index the remaining instances of the column.

To avoid this performance degradation when modeling on the desktop you could disable the Q&A feature. This will also disable smart narratives because they depend on Q&A.

To do this, go to the File, Options and Settings, Options, and turn off the Q&A option.

If Power BI Desktop is connected to a remote model, such as a published Power BI dataset, you’ll see also an option to create a local index. This option was added because Power BI needs to ask user permission to query data from remote sources, build the data index, and store it on user’s machine. By default, it’s disabled until the user explicitly turns on Q&A. For import models, as the data is already on user’s machine, Power BI doesn’t need to ask the permission to query data anymore. That’s why the option to build a local index is not applicable to models with imported data.

Disabling the Q&A in Power BI Desktop affects the local file only. When you publish the model, you reenable Q&A from the data settings if you want end users to use Q&A features. For remote models, if you leave the first option, “Turn on Q&A to ask …”, on, but disable the second option, “Create a local index….”, and publish the model to the service, then Q&A will be enabled in the service by default. That is, you don’t have to go to dataset settings to enable Q&A for that model. For import models, you have to disable the first option, and then after publishing the model to the service, you have to go to dataset settings to enable Q&A there.

Propagating DAX BLANK() Over Relationships

Happy 4th of July with a DAX puzzle!

Consider the following three tables. Notice that InvoiceDetails has a data integrity issue where the third row doesn’t have InvoiceNo (InvoiceNo is blank).

What would a measure with COUNTROWS(Invoice) return for Teo? BLANK(), right? And that’s correct because Teo has no invoices. Now, what would COUNTROWS(InvoiceDetails) return for Teo? BLANK() again given that Teo has no line items? Nope, it returns 1. In other words, it will return the count of all line items whose InvoiceNo is blank. How come?

COUNTROWS(Invoice) returns blank for all customers with no matching rows in Invoice because these customers are mapped to the blank row in Invoice, which is excluded from COUNTROWS by default. But if you change COUNTROWS(Invoice) to COUNTROWS(VALUES(Invoice)), then 1 will be returned for these customers. Then, the filter flows to InvoiceDetals through the blank row and counts all rows with blank InvoiceNo.

This behavior wouldn’t have happened in SQL. However, DAX blank is not SQL null. Blank sales in DAX means no sales or zero sales, but null sales in SQL means unknown sales. Therefore, in DAX blank = blank, but in SQL null is always different than null.

Thanks to the Father of DAX, Jeffrey Wang, for shedding light in dark places.