
A Couple of Direct Lake Gotchas

I’m helping an enterprise client modernize their data analytics estate. As a part of this exercise, a SSAS Multidimensional financial cube must be converted to a Power BI semantic model. The challenge is that business users ask for almost real-time BI during the forecasting period, where a change in the source forecasting system must be quickly propagated to the reporting layer, so the users don’t sit around waiting to analyze the impact. An important part of this architecture is the Fabric Direct Lake storage to eliminate the refresh latency, but it came with a couple of gotchas.

Performance issues with calculated accounts

Financial MD cubes are notoriously difficult to convert to Tabular/Power BI because of advanced features that aren’t supported in the new world, such as Account Intelligence, scope assignments, parent-child hierarchies, and calculated dimension members. The latter presented a performance challenge. Consider the following MDX construct:

CREATE MEMBER CURRENTCUBE.[Account].[Accounts].&[1].[Calculations List].[ROI %]
AS IIF([Account].[Accounts].&[1].[Calculations List].[Average Invested Capital] = 0, NULL, ..,
FORMAT_STRING = "#,##0.0 %;-#,##0.0 %",

This construct adds an expression-based account as though the account is physically present in the chart of accounts. MD evaluates the MDX expression only for that account.


No such construct exists in Tabular. To provide a similar reporting experience, I attempted to overwrite the Value measure conditionally based on the “current” account, such as:

VAR _Level03 = SELECTEDVALUE ('Account'[Level 03]) 
RETURN 
IF (_Level03 <> "Calculations List", 
       [Value],
       VAR _Level04 = SELECTEDVALUE ('Account'[Level 04])
       RETURN
            SWITCH (
               _Level04,
               "ROI %", [ROI],
               "Economic Profit", [Economic Profit],
…

However, no matter what I tried, the report performance got a big dent (from milliseconds to 10+ seconds) even when the Calculations List account was excluded. Interestingly, report performance in Direct Lake fared 2-3 times worse than an equivalent Power BI imported model.

So, we had to scrap this approach in favor of one of these workarounds:

  1. Pre-calculating the calculated accounts values (materializing)
    1. Pros: same reporting behavior as MD, faster performance compared to MDX expressions
    2. Cons: effort shifted to ETL, potentially impacting real-time forecasting if calculations must be recomputed with each change.
  2. Separate DAX measures
    1. Pros: formulas applied at runtime as MD, no impact on ETL
    2. Cons: different report experience
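
As a rough illustration of the first workaround, the calculated account can be appended to the fact data during ETL. The Python below is only a sketch and not the client's pipeline: the row layout (period, account, value) is hypothetical, and the numerator account ("Net Income") is an assumption made for the example, because the MDX formula above is elided. Only the zero-divisor guard is taken from the original expression.

```python
from collections import defaultdict


def materialize_roi(fact_rows, numerator="Net Income",
                    denominator="Average Invested Capital"):
    """Return fact_rows plus one pre-calculated "ROI %" row per period.

    fact_rows is a list of dicts with keys period, account, value
    (a hypothetical layout). The numerator account is an assumption.
    """
    # Index the account values by period so each ratio is computed once.
    by_period = defaultdict(dict)
    for row in fact_rows:
        by_period[row["period"]][row["account"]] = row["value"]

    calculated = []
    for period, accounts in by_period.items():
        capital = accounts.get(denominator)
        income = accounts.get(numerator)
        # Mirror the MDX IIF guard: no row when the divisor is zero or missing.
        if not capital or income is None:
            continue
        calculated.append(
            {"period": period, "account": "ROI %", "value": income / capital}
        )
    return fact_rows + calculated
```

Keep in mind that a ratio stored this way is only valid at the grain it was computed for, so every level users can report on would need its own materialized rows.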

Excel dropping user-defined hierarchies

Excel never fails to disappoint me. Sad, considering its potential as an alternative reporting client, especially for financial users.

This time Excel pivots decided not to show user-defined hierarchies, which turns out to be a documented limitation for DirectQuery and Direct Lake. Microsoft provides no explanation, and I’m sure the Excel team has no plans to fix it or to finally embrace DAX and Power BI semantic models.

Luckily, the client uses a third-party Excel-based tool, which provides better report experience and supports user-defined hierarchies. If the Excel limitation becomes an issue, Fabric Direct Lake is expected soon to support composite models. This will let you implement models with hybrid storage, such as importing dimensions, which don’t change frequently, but leave fact tables in Direct Lake. Luckily, Excel supports user-defined hierarchies with imported tables.

Considerations for Detail Reports

Nobody likes watching a report spinny. Interactive detail reports that perform well from an Analysis Services semantic layer have been the bane of my BI career. A “detail report” is a report that requests data at a lower level, e.g. policy in the insurance business, customer in Sales, etc. A detail report typically has many dimension attributes, such as Customer Name, Account Number, Product Name, Product Number, etc. And the more columns you add, the slower the report gets. The reason why such reports don’t typically perform so well when generated from a semantic layer is that Analysis Services is not SQL Server.

Multidimensional is an attribute-based model and the server cross joins the member values when you add attributes to the report. Don’t be misled by the “relational” nature of Tabular either. Its database engine (xVelocity) is an in-memory columnar database that still cross joins the column values. That said, Tabular should give you a performance boost for two reasons. First, its in-memory nature is generally faster than Multidimensional. Second, Excel has been optimized for Tabular, as I explain in my “Optimizing Distinct Count Excel Reports” blog, thanks to the undocumented PreferredQueryPatterns settings (although not listed in the msmdsrv.ini, it defaults to 1).
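
To get a feel for why every extra column hurts, consider a toy model of the cross join. The sketch below is not how either engine is implemented. It only compares the number of candidate combinations of attribute values with the number of combinations that actually exist, using made-up rows and hypothetical function names.

```python
from itertools import product

# Made-up detail rows: (customer name, account number, product name).
rows = [
    ("Alice", "A-100", "Bike"),
    ("Bob", "A-200", "Helmet"),
    ("Carol", "A-300", "Bike"),
]


def candidate_combinations(rows, n_columns):
    """Count the cross join of the distinct values of the first n_columns."""
    distinct = [sorted({r[i] for r in rows}) for i in range(n_columns)]
    return sum(1 for _ in product(*distinct))


def existing_combinations(rows, n_columns):
    """Count the combinations that actually occur in the data."""
    return len({r[:n_columns] for r in rows})


for n in (1, 2, 3):
    print(n, candidate_combinations(rows, n), existing_combinations(rows, n))
```

The candidate count grows multiplicatively with each added column, while the number of rows the report actually needs stays the same.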

In a recent project, I migrated a customer from Multidimensional to Tabular. Besides other benefits, such as eliminating snapshots, dramatically reducing ETL time, and analyzing data as of any date (not just at the month end), the customer wanted to improve the performance of Excel detail reports. Previously, some of the detail reports would never return. After the migration, these reports would execute within a minute, and in seconds if the Excel subtotals are disabled.

Here are some tips you might find useful if you’re tasked to produce detail reports:

  1. If the detail reports don’t require too much interactivity, consider implementing them as SSRS paginated reports that connect directly to the database. Chances are that this approach will give you the fastest performance, as the Database Engine is designed to retrieve data in sets by rows and the number of columns probably won’t affect report performance (unless of course many joins are required).
  2. If it’s desired to connect these reports to a semantic layer, consider Tabular because it’s better for wide & flat results.
  3. If Excel is used as a front end, consider Excel 2016 as Microsoft has made various performance improvements for detail reports.
  4. Consider disabling Excel subtotals, as explained here. In my scenario, a detail report with no subtotals would execute in under 15 seconds. However, if I enable just one column subtotal, Excel switches to a completely different query pattern (with many nested DrilldownMember levels) and the report query would take a minute. That’s because now the query needs to obtain the subtotals for each group from SSAS. Since we’re at the mercy of the Excel MDX query generator, I hope the Excel team finds a way to produce more optimal MDX queries. Ideally, a future Excel release would allow binding Excel native tables directly to SSAS with the ability to define subtotals and optimized queries to load the data.
  5. In my experience, Power BI reports that generate DAX don’t necessarily perform any better. To make things worse, while the new PBI Matrix visual can be configured to a flattened layout, it doesn’t currently support disabling specific subtotals (currently, you can remove all row subtotals or have them for each and every column). And asking for subtotals for every column might not only contradict business requirements but could also severely affect performance. UPDATE 8/14/2017 – The August release of Power BI Desktop supports configuring row subtotals per level.

Many thanks to Akshai Mirchandani from the SSAS product group for not losing patience with my complaints on this subject throughout all these years.


Understanding Writeback Target Allocation

I’m working on architecting a financial planning solution powered by Analysis Services Multidimensional. One thing that might not be obvious is how Multidimensional selects the target of writeback allocation. In this case, planning will be done at Customer and Product level. With the default equal allocation when writing at the customer level, it might appear that writeback doesn’t work correctly. You’d expect that only the cells that contribute to the aggregated value (10 in the screenshot below) will be affected by writeback. However, if the Customer and Product entities are in different dimensions, writeback will affect all products!

The reason behind this becomes obvious if you right-click the pivot table, and from its options enable “Shows rows with no data”. Then, you’ll see all products appearing with each customer (customers are cross joined with products). Recall that by default, the pivot table uses NON EMPTY in the MDX query to exclude combinations that don’t exist in the cube. But writeback makes no such assumptions. The reason is that if the cell you write to is empty, skipping empty combinations would leave nowhere to allocate the writeback value. If the Customer and Product entities are in the same dimension, then the default equal allocation will write to all children of the affected parent, irrespective of whether their values contribute to its aggregated cell.

So, writeback is not the same as drilling through a cell. Now that you know how it works you can use different allocation settings to achieve the behavior you want. For example, you can choose a weighted allocation with the following expression to avoid writing back to empty values:

iif(Measures.CurrentMember = 0, null, Measures.CurrentMember)
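
The difference between the two allocation methods is easier to see with numbers. The Python below is a simplified model of the arithmetic, not the Analysis Services implementation: the cell values are made up to add up to the aggregated value of 10 mentioned above, the function names are hypothetical, and the weighted variant assumes weights proportional to the existing values with empty cells skipped.

```python
def equal_allocation(cells, new_value):
    """Spread new_value evenly over every leaf cell, empty or not."""
    share = new_value / len(cells)
    return {key: share for key in cells}


def weighted_allocation(cells, new_value):
    """Spread new_value in proportion to existing values; empty cells stay empty."""
    total = sum(v for v in cells.values() if v)
    if not total:
        raise ValueError("no non-empty cells to weight by")
    return {
        key: (new_value * v / total if v else None)
        for key, v in cells.items()
    }


# One customer cross joined with four products; only two combinations have data.
cells = {
    ("Customer A", "Product 1"): 4,
    ("Customer A", "Product 2"): 6,
    ("Customer A", "Product 3"): None,
    ("Customer A", "Product 4"): None,
}

print(equal_allocation(cells, 20))     # every product receives 5.0
print(weighted_allocation(cells, 20))  # 8.0 and 12.0; empty cells stay None
```

In this model, equal allocation writes to all four products, while the weighted variant touches only the two cells that contributed to the original total.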

Trend Lines in Power BI Charts

Recently, Power BI charts introduced trend lines. However, they require numeric or date values on the X-axis, which must have a continuous type. In fact, if you use a text field for the X-axis, a warning indicator will be displayed in the top left corner of the chart to warn you that non-numeric values are used.


This requirement presents issues if the report is connected to a Multidimensional cube because by default all attributes are text-based. As a workaround, in the Multidimensional project set the ValueColumn property of the attribute to a column in the underlying table of a numeric or date data type, and deploy the cube.


Back in Power BI Desktop, bind the corresponding .Value field to the X-axis.
