
A First Look at DAX Visual Calculations: the Good, the Bad, and the Ugly

The February 2024 release of Power BI Desktop includes a preview of visual calculations. As its name suggests, a visual calculation is a visual-scoped DAX measure that works at the aggregate (visual) level.

The Good

Visual calculations make previously difficult tasks much easier. Consider the following matrix:

Suppose you need a measure that calculates the difference between product categories in the order they are sorted in the visual. Implementing this as a regular DAX measure is a challenge. Yet, if we had a way to work with the cells in the visual, we could easily get this to work. Ideally, this would work like relative references in Excel, but DAX doesn't know about relative references. Visual calculations, however, do (kind of).

Let’s right-click on the visual and select “New calculation”. In the visual-level DAX formula bar, enter the following formula:

Category Diff = [Sum of SalesAmount] - PREVIOUS([Sum of SalesAmount], COLUMNS)

The Category Diff measure computes the difference between the current “cell” and the cell for the previous category. For example, for Bikes it will be $9,486,776-$28,266 (Accessories value).

Notice the use of the PREVIOUS function, which is one of the new DAX functions designed specifically for visual calculations. Notice also that PREVIOUS accepts additional arguments, one of which is the AXIS argument, which specifies whether the function should evaluate cells positionally on columns or on rows. Finally, notice that visual-level formulas can reference only fields or measures placed in the visual.
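To get a feel for the AXIS argument, here is a hypothetical variation (a sketch I haven't tested, assuming the same [Sum of SalesAmount] measure is placed in the visual) that computes the difference along the rows of the matrix instead of its columns:

Row Diff = [Sum of SalesAmount] - PREVIOUS([Sum of SalesAmount], ROWS)

In other words, the formula stays the same; only the axis along which PREVIOUS walks the visual's cells changes.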

The Bad

Among the various limitations, the following will cause some pain and suffering:
1. A visual calculation effectively disables exporting the visual data.
2. Drillthrough is disabled.
3. Can't format visual calculations unless the visual supports it (the matrix visual does support measure formatting).
4. Can’t apply conditional formatting.
5. Can’t change sort order.
6. Can’t use field parameters.

The Ugly

The way Microsoft advertises this feature is that it is “easier than regular DAX”. My concern is that, tempted by that promise, users will start abusing this feature left and right, such as by creating visual calculations that are better implemented as regular DAX measures, e.g. for summing or averaging values. And before long, such users will find themselves in an Excel-like spreadmart hell, which is what they tried to avoid by embracing Power BI.
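To make the distinction concrete, a simple aggregation like the following (a hypothetical model-level measure with made-up table and column names) belongs in the semantic model as a regular measure, where every visual and report can reuse it consistently, rather than being recreated as a visual calculation:

Total Sales = SUM ( Sales[SalesAmount] )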

Therefore, please use these visual calculations only when regular DAX measures will not suffice. Business metrics should be centralized and should return consistent results, no matter the reporting tool or visual they are placed in. This is important for achieving the elusive single version of truth.

Atlanta Microsoft BI Group Meeting on March 4th (Navigating Microsoft Fabric – Choosing the Right Workload for Your Needs)

Atlanta BI fans, please join us in person for the next meeting on Monday, March 4th at 6:30 PM ET. The famous Patrick LeBlanc (Guy in a Cube) will take a deep dive into the Microsoft Fabric ecosystem, from Lakehouse to Warehouses and Power BI, ensuring you can make informed decisions about your data processing needs. Your humble correspondent will help you catch up on Microsoft BI latest. CloudStaff.ai will sponsor the event. For more details and sign up, visit our group page.

Presentation: Navigating Microsoft Fabric – Choosing the Right Workload for Your Needs

Delivery: In-person

Date: March 4

Time: 18:30 – 20:30 ET

Level: Beginner/Intermediate

Food: Pizza and drinks

 

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Microsoft BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Overview: As businesses transition to the cloud and leverage advanced analytics, understanding the nuances of data infrastructure becomes paramount. Microsoft Fabric offers a suite of powerful tools designed to handle various data workloads, but the key to harnessing its full potential lies in understanding which tool to use and when. This session provides a deep dive into the Microsoft Fabric ecosystem, from Lakehouse to Warehouses and Power BI, ensuring that participants can make informed decisions about their data processing needs. We’ll also look at current limitations that will help guide you.

Speaker: Patrick LeBlanc is currently a Principal Program Manager at Microsoft and a contributing partner to Guy in a Cube. Along with his 15+ years of experience in IT, he holds a Master of Science degree from Louisiana State University. He is the author and co-author of five SQL Server books. Prior to joining Microsoft, he was awarded the Microsoft MVP award for his contributions to the community. Patrick is a regular speaker at many SQL Server conferences and community events.

Sponsor: CloudStaff.ai


Know Thyself: Power BI Source Control

Last year my wife and I did a tour of Greece, and we had a blast. Greece, of course, is the place to go if you are interested in ancient history and the origin of democracy. One of the places we visited was Delphi. The ancient Greeks believed it to be the center of the universe. Now not much was left of it except lots of ruins and imagination. But back then it was magnificent. People from all over the world would come to consult with the Oracle of Delphi. She delivered her prophecies from the temple of Apollo, which had three inscriptions, with one of them being “Know thyself”. The practical benefit for the oracle was that if you believed her cryptic prophecy wasn’t fulfilled then your interpretation was wrong. Therefore, the problem was in you because you didn’t know yourself.

How does this translate into BI? I see clients overly excited about Microsoft Fabric/Power BI Premium, believing that bundling features will solve all their issues. But knowing your organization, ask yourself if your users would use all these features to justify the premium price. A case in point is Power BI source control via workspace Git integration, a feature that appears to be created by developers for developers. Kristyna Hughes did a great presentation for our Atlanta BI Group on Monday covering how developers can make the most of this feature.

Given the self-service focus of Power BI, however, I doubt that data analysts would subject themselves to learning Azure DevOps, Visual Studio Code, and Git CI/CD. Yet, Power BI source control has been in demand since the beginning with the most common ask – the ability to roll back changes.

Here is my take to simplify Power BI source control for regular users:

Power BI Premium/PPU/Fabric clients

  1. If you are on Power BI Premium, set up a branch for each workspace that you want to put under source control, and configure the workspaces for Git integration.
  2. Let business users publish changes as usual.
  3. Periodically and as part of the change management process, the workspace admin approves the changes and commits them to source control. I hope one day Power BI will transparently commit changes to Git as Azure Data Factory does, without requiring explicit synchronization. Meanwhile, the admin must commit manually.
  4. Someone with sufficient Azure DevOps permissions would need to roll back changes if needed. Again, I hope one day history review, compare, and rollback will be baked into Power BI.

Power BI Pro clients

  1. Once this feature is generally available, embrace Power BI Desktop projects.
  2. When significant changes are made, back up the report and model.bim JSON files to some location, such as OneDrive, which has built-in version control.
  3. Replace the project files when you need to roll back changes. Again, this “poor man's” source control emphasizes simplicity and saves premium licenses.

Atlanta Microsoft BI Group Meeting on February 5th (Power BI – Time to Git CI/CD)

Atlanta BI fans, please join us for the next meeting on Monday, February 5th at 6:30 PM ET. Note that this will be an online meeting via MS Teams and non-Atlanta BI fans are also welcome. Kristyna Hughes (Senior Consultant with 3Cloud) will show us how to implement source control for Power BI reports. Your humble correspondent will help you catch up on Microsoft BI latest. For more details and sign up, visit our group page.

Presentation: Power BI – Time to Git CI/CD

Delivery: Online

Date: February 5

Time: 18:30 – 20:30 ET

Level: Intermediate

Food: NA

 

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Microsoft BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Overview: As report developers know, requirements are constantly evolving, which leads the reports themselves to change. However, inevitably, there will be someone who uses a report and was unaware of the changes. They may request a list of changes or want the report rolled back to a previous version. With git, we can provide a list of changes and roll back to any version requested.

Taking a page from application developers, Power BI developers can now develop reports in branches and merge changes using git and Azure DevOps! If half those words are new to you and you’re looking for a better way to manage version history, come to this session! We will go through the basics of git including what branches are, how to use them, and some best practices while working with git. Git has a few tricks that we will go through as well like fetching a branch, in-line comparisons, and pulling to avoid merge conflicts. Don’t worry, we will also touch on how to resolve merge conflicts as well as how to avoid them. After the basics of git, we’ll walk through how to use git with Power BI workspaces and Azure DevOps.

Speaker: Kristyna Hughes is a senior data & analytics consultant at 3Cloud. Her experience includes implementing and managing enterprise-level Power BI instances, training teams on reporting best practices, and building templates for scalable analytics. Passionate about participating in and growing the data community, she enjoys co-writing on Data on Wheels (dataonwheels.com) and has co-founded Data on Rails (dataonrailsblog.com). She is also a co-organizer of the Lexington Data Technology Group and co-host of Wednesday Lunch & Learns on YouTube with Chris Wagner.


Notes on Fabric F2 Performance: Report Load

What better way to spend a lazy holiday afternoon than to do more Fabric performance testing? In my previous post, I shared my results from a single-threaded ETL load test to gauge the F2 ingest performance, and F2 did pretty well (or at least outperformed Azure SQL DB). Will F2 hold up as parallelism increases? Throughput testing is especially important for report loads because parallel tasks can run within a report, such as visuals executing DAX queries in parallel, and across reports, such as when concurrent report requests overlap.

For this test, I used the artifacts included in the Microsoft “Dashboard in a Day” course and load tested only the first report page.

The Sales fact table in the semantic model has over seven million rows, so it represents a good-sized dataset. Naturally, the more involved the report is and the more data the semantic model has, the more CPU power and parallelism are needed. I used the Microsoft Power BI Dedicated Capacity Load Assessment Tool and configured it to filter the report on different years in order to avoid report caching. I ran four tests for 1, 2, 3, and 4 virtual users with no think time, and each test ran for 10 minutes.

Here are the results:

Users | Total report renders | Renders per user
1     | 63                   | 63
2     | 100                  | 50
3     | 135                  | 45
4     | 164                  | 41

Here are some additional findings:

  1. Within the first minute or so, F2 generates reports fast, presumably because bursting comes into play to let F2 borrow more CPU resources. As the sustained load continues, Fabric starts scaling back and throttling CPU.
  2. As time goes by, report executions get increasingly slower. While during the first minute a report can take 1-2 seconds, for example, later it might take as much as 40 seconds to render (users are unlikely to tolerate this). This happens even with one virtual user, presumably because a quarter of a core is insufficient.
  3. I haven't encountered any errors. All report executions succeeded irrespective of how long they took.

What all these tests mean is that if Fabric is appealing to you, F2 can be a viable option for smaller organizations where report users are expected to run reports sporadically. In case of a sudden load, such as everyone running reports at 8 AM on Monday, Fabric bursting can alleviate the pressure for the first minute or so. Of course, you need to weigh many other factors, such as relinquishing control to Microsoft, waiting for the technology to mature, avoiding lock-in, evaluating budget (note that viewers would still need at least Power BI Pro individual licenses for capacities lower than F64), and various other considerations I covered in my previous Fabric-related posts.

Notes on Fabric F2 Performance: Warehouse ETL

Inspired by Amir Netz‘s encouragement to partners to test the Fabric F2 capacity performance, I got on a quest to see what it would do for ETL loads to a Fabric Warehouse. I must admit that I was skeptical that a quarter of a core would get a warehouse off the ground, but as usual, life proved me wrong, and “wrong” is a big understatement of what happened.

After provisioning a Fabric F2 capacity and a warehouse, I settled on the Retail Data Model for World Wide Importers sample star schema dataset, consisting of five dimension tables and one fact table. In terms of performance, I was mostly interested in how long it would take for the ADF copy activity to insert all the data (50 million rows) into the fact table. Granted, it's a limited test, but enough to decide whether to rule out the technology for real-life projects. Then, I compared the performance against Azure SQL Database Serverless running on up to 2 cores and provisioned through the free trial offer that Microsoft has on Azure. To exclude the impact of data transfer between regions, both technologies were provisioned in the East US 2 region, which is where my Power BI tenant is hosted.

Much to my surprise, it took less than two minutes to load all 50 million rows in F2, whereas it took an hour to load 27 million rows into Azure SQL Database before the maximum 30 GB disk space was exhausted! I couldn't believe it, so I ran the test three times to confirm. Surely, bursting helps a lot! Now, throughput would be a different story, but as far as the warehouse is concerned, it doesn't matter because in most cases, data will be imported into a Power BI semantic model and the warehouse will be out of the picture. As a next step, I plan to test the report throughput to see what concurrent report load would saturate the F2 capacity.

In conclusion, the lowest Fabric capacity, F2 ($262.80 monthly cost), could be a viable option for smaller organizations willing to make their foray into the Fabric world. On the downside, we must leave Fabric to marinate for a few months and gain needed features, including surrogate keys and MERGE for Warehouse and on-prem connectivity for Azure Data Factory, before it can be considered for real-life projects. More tests are needed to gauge the F2 report throughput.

UPDATE 12/24/2023 I was curious how much loading the same dataset from a CSV file would impact performance. It took much longer: 18 minutes. The most significant factor was that loading from CSV requires staging to a data lake, although this appears redundant because the CSV file was already in a lakehouse in the same Power BI workspace. ADF spent a total of 18 minutes in the two-staged copy (ten minutes to stage the data and eight minutes to load the fact table from the staged copy). Therefore, Parquet significantly outperformed CSV, probably because the Microsoft-provided Parquet file was compressed.

Prologika Newsletter Winter 2023

I’ve covered my first impressions of Microsoft Fabric in a series of blogs starting with this one. In the recap, I said that certain features might benefit organizations that are already on Power BI Premium because all Fabric features are available in Power BI Premium plans. However, as they stand, features such as Lakehouse Copy to Table (no automated synchronization), ADF pipelines (no connectivity to on-prem data sources), Data Warehouse (missing important T-SQL features), and Direct Lake (only available online) are half-baked, thus barring widespread Fabric adoption. This newsletter covers a good case for Fabric that I’ve recommended for a large insurer that has invested heavily in Microsoft Dynamics 365 and Power BI Premium. I’ll be in a better position to confirm after a POC with larger datasets, so follow my blog for future updates.

The Dynamics Integration Saga

Dynamics Online, aka Dynamics 365, epitomizes the customer’s struggle to export their data hosted in SaaS cloud offerings for analytics or other purposes. Since unfortunately Microsoft doesn’t provide direct access to the Dynamics native storage (Azure SQL Database), which would often be the best solution, Dynamics has “compensated” throughout the years by providing the following options:

Option | Pros | Cons
OData API | Original interface | Slow, no query folding, currently deprecated
Data Export Service | Export to Azure SQL Database | Deprecated
TDS endpoint | Real-time direct access to Dynamics Azure SQL Database | Not designed for massive extracts; slow
Dataverse API | Currently recommended by Microsoft | As far as I can tell, no REST operations (see below); questionable performance
Dataverse connector | Same as Dataverse API | Same as Dataverse API
Export to ADLS | Automatic synchronization | Perpetually in preview; files stored in proprietary CDM folder format
Fabric link | Automatic synchronization; delta tables; SQL joins | Requires Fabric licensing; up to 60 min data sync delay; TBD

One of my clients, a large insurance company, has invested heavily in Dynamics 365. They have implemented various Power BI reports that import millions of rows directly from common CRM entities, such as Account and Opportunity, as well as custom entities, using the OData API. The main struggle has always been that dataset refreshes take hours to complete. Because API calls don’t fold in Power Query, they apply REST operations to select a subset of columns, join tables, and filter data, and they rely heavily on Power Query to shape the data, which has worked reasonably well.

Dataverse Link to Fabric

Dataverse Link to Fabric looks promising in addressing some of the above pitfalls. Once you set it up, it extracts Dynamics data as Delta tables in a Fabric lakehouse. From there, you can use the SQL endpoint to join and filter tables.

After the initial synchronization, it might take up to 60 minutes for data changes in Dynamics to synchronize, so it's not really real-time data access, but it's much better than the current state.

While working through the setup, keep the following in mind:

  1. The Link to Microsoft Fabric menu is found under the Analyze menu in the ribbon and not at the table level as the video shows. That’s because it exports all the tables.
  2. Referring to the comparison table between Fabric Link and Synapse Link, “No copy, no ETL direct integration with Microsoft Fabric” should be read as “No ETL” because the data is still copied. Further, “Data stays in Dataverse” should read as “Data is automatically copied to Fabric lakehouse”.
  3. As for prerequisites, the most important one is that “the workspace you choose to link with Dataverse must be assigned to a premium capacity in the same region as your Dataverse environment”, with the emphasis on the “same region”. Not having them in the same region is a showstopper.

Cutting Cost

What if you are not on Power BI Premium and you are on a tight budget? Although I haven't tested it, this approach (suggested by Riccardo Perico) might work because Link to Fabric is available in all Fabric capacities:

  1. Purchase a lower Fabric capacity, such as F2 or F4.
  2. Configure Dataverse Link to Fabric as usual.
  3. After the initial synchronization, pause and resume the capacity when not in use, such as by using this approach.
  4. Import the data in a Power BI semantic model.

Dataverse Link to Fabric should queue the updates until the capacity is resumed.

Benefits

Again, more testing is needed, but at this point I see the following advantages of using Dataverse Link to Fabric:

  1. CRM data is staged in open file format (Delta Parquet tables) and available for any reporting needs without impacting Dynamics.
  2. No need to use APIs. Instead, you can use SQL to join and filter tables, such as to look up columns or filter data, which is much better than using the API syntax. This will also help you centralize column lookups in SQL views instead of doing this over and over for each report.
  3. Potentially much faster Power BI dataset refreshes
  4. No additional cost to Power BI Premium customers because the Fabric features are available in Power BI Premium.


Teo Lachev
Prologika, LLC | Making Sense of Data

Atlanta Microsoft BI Group Meeting on November 6th (Exploring Bravo for Power BI)

Atlanta BI fans, please join us for the next meeting on Monday, November 6th, at 6:30 PM ET. Marco Russo will introduce us to the community tool Bravo for Power BI. Your humble correspondent will help you catch up on Microsoft BI latest. I’ll sponsor the event. For more details and sign up, visit our group page.

PLEASE NOTE A CHANGE TO OUR MEETING POLICY. WE HAVE DISCONTINUED ONLINE MEETINGS VIA TEAMS. THIS GROUP MEETS ONLY IN PERSON. WE WON’T RECORD MEETINGS ANYMORE. THEREFORE, AS DURING THE PRE-PANDEMIC TIMES, PLEASE RSVP AND ATTEND IN PERSON IF YOU ARE INTERESTED IN THIS MEETING.

Presentation: Exploring Bravo for Power BI

Delivery: In-person

Date: November 6

Time: 18:30 – 20:30 ET

Level: Beginner

Food: Pizza and drinks will be provided

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Venue

Improving Office

11675 Rainwater Dr

Suite #100

Alpharetta, GA 30009

Overview: Bravo for Power BI is a free external tool to analyze a model, format DAX, export data, and manage the dates table and time intelligence calculations. We will explore the tool’s capabilities, with additional insights on the time intelligence calculations feature, which can be customized by creating new templates. Some features are also available in more advanced tools like DAX Studio, but the goal of Bravo is to make them available to all Power BI users by providing a simpler user interface that is good for the more common use cases.

Speaker: Marco Russo is a business intelligence consultant and mentor. He first started working on Analysis Services in 1998, back when Analysis Services was known as OLAP Services. Marco wrote several books about Power BI, Analysis Services, and Power Pivot with Alberto Ferrari. They also regularly write articles and white papers that are available on sqlbi.com.

Sponsor: Prologika


Atlanta Microsoft BI Group Meeting on October 2nd (30 Power BI Tips and Tricks)

Atlanta BI fans, please join us for the next meeting on Monday, October 2nd, at 6:30 PM ET.  Michael Hewitt will share 30 tips to make you more productive with everything Power BI. Your humble correspondent will help you catch up on Microsoft BI latest. For more details and sign up, visit our group page.

PLEASE NOTE A CHANGE TO OUR MEETING POLICY. WE HAVE DISCONTINUED ONLINE MEETINGS VIA TEAMS. THIS GROUP MEETS ONLY IN PERSON. WE WON’T RECORD MEETINGS ANYMORE. THEREFORE, AS DURING THE PRE-PANDEMIC TIMES, PLEASE RSVP AND ATTEND IN PERSON IF YOU ARE INTERESTED IN THIS MEETING.

Presentation: 30 Power BI Tips and Tricks

Delivery: In-person

Date: October 2nd

Time: 18:30 – 20:30 ET

Level: Beginner to Intermediate

Food: Sponsor wanted

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Venue

Improving Office

11675 Rainwater Dr

Suite #100

Alpharetta, GA 30009

Overview: This session offers 30 tips and tricks born from years of working with Power BI in the trenches and teaching Power BI to others. We’ll cover many facets of Power BI including DAX, Power Query, visualization, external tools, and more! These tips aim to be practical and useful to Power BI users of all skill levels so that you can take them back to work and level up your Power BI solutions.

Speaker: Michael Hewitt leads the BI and Analytics team at Hunt Brothers Pizza, based in Nashville, TN. He enjoys data visualization and working with business users to help them leverage data more effectively in their daily jobs. In previous roles, Michael has delivered Microsoft-based BI solutions in the healthcare, distribution, and manufacturing industries. He has also had the privilege of developing and delivering custom Power BI training for the Department of Defense. Outside of work, Michael is a husband and father of three, a leader in his local church, and a co-organizer of the Nashville Modern Excel and Power BI user group; he enjoys drumming, woodworking, and hiking.

 



Excel Connected Tables: Another Opportunity Missed

For years users have been asking for Excel tables (not pivots) connected to SSAS so they can simply browse the data in a tabular format that is not subject to the shortcomings of pivots. And for years Excel has supported just pivots, which natively understand only SSAS OLAP cubes. Is Power BI changing things? Somewhat. Enter refined Excel connected tables. They look great on the surface. You go through a wizard that lets you select and filter your data, and you get an Excel table connected to a Power BI dataset. Awesome, right?

Unfortunately, what’s missing is the Field List. To make the feature useful, the user should be able to access the Field List and remove, add, and filter fields, as they can with pivots. However, once the table is generated, to change it you must either go through the wizard again or change the DAX query (right-click the table, click Table, and then Edit Query). Down the DAX rabbit hole we go again…
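For reference, the query behind a connected table is a DAX EVALUATE statement. Here is a rough sketch of what such a query might look like (hypothetical; the wizard-generated query and the table and column names will differ):

EVALUATE
SUMMARIZECOLUMNS (
    'Product'[Category],
    'Date'[CalendarYear],
    "Sales Amount", [Sales Amount]
)

Hand-editing a query like this to add or remove a column is exactly the kind of chore a Field List would spare the user.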

As they stand, Excel connected tables are not an interactive browsing tool. Instead, they are geared towards making Power BI datasets accessible in Excel.

Since I’ve lost hope that Excel will discover Tabular and DAX one day, I hope the community will step in to fill the gap by providing a UI like the pivot Field List, so that the user doesn’t need to mess with DAX when all that’s needed is an Excel connected table for ad hoc data exploration.