Power BI Dynamic M Query Parameters – Another Opportunity Missed for DirectQuery Users

UPDATE 17-Feb-2022: Dynamic query parameters now support popular relational data sources, and I posted an updated blog.

I recently had a Power BI advisory engagement for a large ISV. They had many customers, and each customer's data was hosted in its own database for security reasons, with all databases sharing an identical schema. Power BI reports had to connect with DirectQuery to avoid refreshing data (real-time BI). Naturally, the customer envisioned a single set of reports with the ability to switch the dataset connection to the respective database depending on the user's identity and company association. In addition, they wanted to piggyback on the existing database security policies by having each visual call a stored procedure and pass the user identity as a dynamic parameter.

Can Power BI do this? Unfortunately, no. The main architectural issue is that although the model (VertiPaq) and query (Power Query) layers are bundled in Power BI Desktop (and in a single dataset when published), they are like two separate universes. Naturally, this is not an issue when data is imported, but it becomes important with DirectQuery. As it stands, the only communication between the two layers is Power Query passing data back to the model. You can define query parameters, but they are static, meaning that they must be specified on the dataset settings page in Power BI Service. In other words, a query parameter can't reference DAX constructs, such as USERPRINCIPALNAME(), and therefore can't react to runtime conditions.
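To illustrate the "static" part, here is a minimal sketch of what such a query parameter looks like in M (the server, database, and query names are made up for this example). The parameter value is baked into the query definition, so after publishing it can only be changed on the dataset settings page, never per user session:

```
// Query "CustomerDb" - a parameter query as Power BI Desktop generates it (hypothetical value)
"CustomerA_DB" meta [IsParameterQuery = true, Type = "Text", IsParameterQueryRequired = true]

// Query "Sales" - references the parameter; the value is resolved at design/refresh time,
// so it can't react to USERPRINCIPALNAME() or any other runtime context
let
    Source = Sql.Database("myserver.database.windows.net", CustomerDb),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data]
in
    Sales
```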

That’s why I initially rejoiced when Microsoft announced Dynamic M Query Parameters, a feature that just went into public preview in the Power BI Desktop October 2020 release. Unfortunately, a better name for this feature would have been Kusto Dynamic Parameters, since the primary design intent was to support Azure Data Explorer (internally named Kusto). It doesn’t apply to DirectQuery connections to most relational databases.

Although this isn’t called out anywhere, Power BI has two types of connectors: native and M-based. Native connectors target the most popular relational data sources: T-SQL (Azure SQL Database, SQL Server, Synapse), PL/SQL (Oracle), Teradata, and relational SAP HANA. Dynamic query parameters won’t work with native connectors. The rest (Microsoft-provided and custom) are M-based. If an M connector supports DirectQuery, it should support dynamic query parameters too. For example, besides Azure Data Explorer, other M-based data sources that support DirectQuery, and therefore dynamic parameters, are Amazon Redshift, Snowflake, and Google BigQuery.
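For completeness, here is a rough sketch of how the feature is meant to work with an M-based source such as Azure Data Explorer: define an M parameter, reference it in the source query, and then bind the parameter to a model column on the Modeling tab so that slicer selections drive it. The parameter name and filter below are hypothetical; the cluster is Microsoft's public Kusto sample:

```
// Query "StateParam" - an M parameter that will be bound to a model column
"ALABAMA" meta [IsParameterQuery = true, Type = "Text", IsParameterQueryRequired = true]

// A DirectQuery source that folds the parameter value into the Kusto query
let
    Source = AzureDataExplorer.Contents(
        "https://help.kusto.windows.net",                    // public sample cluster
        "Samples",
        "StormEvents | where State == '" & StateParam & "'")
in
    Source
```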

I hope Microsoft will revamp this feature to support the popular relational databases and DAX functions, so we can address more involved DirectQuery scenarios and make parameters truly dynamic.

Stop Using the Power BI Folder Connector

I know that it’s tempting to use the Folder connector to combine multiple files with the same schema and load them into Power BI Desktop with a few clicks – an idea promoted by Microsoft’s own Power BI Dashboard in a Day (DIAD) training.

But what happens when you want to automate the dataset refresh after publishing your model to Power BI Service? Suddenly, things don’t work so well anymore, because the service can’t reach files sitting on your local machine. The only option is to install a personal gateway on your laptop, with all the issues surrounding it.

Instead, I recommend you upload the files to Azure Data Lake Storage and use the Power BI Azure Data Lake Storage Gen2 connector. This will require some assistance from your friendly IT department to provision ADLS and grant you access, but they shouldn’t complain if they are serious about data governance.

Like the Folder connector, the ADLS Gen2 connector lets you combine multiple files from a folder (see the sketch after the list below). Moreover, this deployment option offers several benefits over keeping files locally:

  1. Eliminates gateways.
  2. Allows files to be centralized and organized nicely in folders.
  3. Enables integration scenarios, such as a third-party system pushing files on a schedule.
  4. Provides secured access and supervision by IT.
  5. Supports CDM folders in case you ever want to go there, such as to load files from Power BI dataflows or Dynamics CRM.
  6. Offers high availability, performance, and scalability.
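As a rough sketch, combining CSV files from an ADLS Gen2 folder might look like this in Power Query (the storage account, file system, and folder path are hypothetical, and the files are assumed to share one header row and schema):

```
let
    // Point the ADLS Gen2 connector at a folder (hypothetical account and path)
    Source = AzureStorage.DataLake("https://mystorageaccount.dfs.core.windows.net/sales/monthly"),
    // Keep only CSV files in case the folder holds other content
    CsvFiles = Table.SelectRows(Source, each Text.EndsWith(Text.Lower([Name]), ".csv")),
    // Parse each file's binary content and promote its header row
    Parsed = List.Transform(
        CsvFiles[Content],
        each Table.PromoteHeaders(
            Csv.Document(_, [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
            [PromoteAllScalars = true])),
    // Append all files into a single table
    Combined = Table.Combine(Parsed)
in
    Combined
```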

Power Query Merging on GUID Columns

Scenario: You are merging two tables in Power Query on GUID columns of Text data type. In my case, the first table (UserDim) came from Azure SQL DB and the second (systemusers) from Dynamics CRM, but I believe the source is irrelevant. Then, you expand the second table in the step after the merge, select the columns you need, and see the expected results in the preview pane. However, when you apply the changes, the related columns are not populated (empty) in the Power BI data model. In other words, the join fails to find matches.

Solution: I think there is some sort of bug in Power Query. True, Power Query is case-sensitive, so joining on text values that differ in casing between the two columns would fail to match. However, in this case the GUIDs were all lowercase (or at least that’s how they appeared in the preview). The only way I can explain the issue is that the refresh somehow converts the GUID values to a different casing before the join is performed. Anyhow, the problem was solved by adding an explicit step to convert both GUID columns to lowercase before the merge step, as sketched below.
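Here is a minimal sketch of the workaround, assuming UserDim and systemusers are existing queries and using hypothetical key and expanded column names:

```
let
    // Normalize the GUID key to lowercase on both sides before merging
    Users = Table.TransformColumns(UserDim, {{"UserGuid", Text.Lower, type text}}),
    CrmUsers = Table.TransformColumns(systemusers, {{"systemuserid", Text.Lower, type text}}),
    // Merge (left outer join) on the normalized keys
    Merged = Table.NestedJoin(Users, {"UserGuid"}, CrmUsers, {"systemuserid"}, "systemusers", JoinKind.LeftOuter),
    // Expand only the columns needed from the second table (illustrative column names)
    Expanded = Table.ExpandTableColumn(Merged, "systemusers", {"fullname", "internalemailaddress"})
in
    Expanded
```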

Solving Configuration Errors with ADF Self-hosted IR

You’ve set up the Azure Data Factory self-hosted integration runtime to access on-premises data sources. You create a linked service, click Test Connection, and get greeted with an error saying the security context can’t be passed. On the on-premises VM, you use the Integration Runtime Configuration Manager and get a similar error, or something to the effect that JSON can’t be parsed. You spend a few hours trying everything that comes to mind, such as checking firewalls and connectivity from SSMS, but nothing helps.

How do we fix this horrible problem? We double the backslashes in the server name (if you use a named instance) and in the user name (after the domain) in the linked service properties – for example, MYSERVER\\SQL2019 instead of MYSERVER\SQL2019, and DOMAIN\\user instead of DOMAIN\user. Apparently, Spark/Databricks has an issue with backslashes.