
Atlanta Microsoft BI Group Meeting on June 3rd (Power BI Direct Lake storage mode)

Atlanta BI fans, please join us in person for the next meeting on Monday, June 3rd, at 6:30 PM ET. Shabnam Watson (Consultant and Owner of ABI Cube) will discuss the benefits of using the Direct Lake storage mode in Microsoft Fabric. Your humble correspondent will help you catch up on the latest in Microsoft BI. CloudStaff.ai will sponsor the event. For more details and to sign up, visit our group page.

Presentation: Power BI Direct Lake storage mode: How to achieve blazing fast performance without importing data
Delivery: In-person
Time: 18:30 – 20:30 ET
Level: Beginner/Intermediate
Food: Pizza and drinks will be provided

Agenda:
18:15-18:30 Registration and networking
18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)
19:00-20:15 Main presentation
20:15-20:30 Q&A

Venue
Improving Office
11675 Rainwater Dr
Suite #100
Alpharetta, GA 30009

Overview: The Power BI engine in Microsoft Fabric has been significantly revamped to work directly with Delta files in OneLake. This brand-new storage mode, called Direct Lake, allows Power BI to achieve super-fast query performance on billion-row datasets without having to import the data into Power BI. Join this session to learn how you can work with Direct Lake with just a few clicks.

Speaker: Shabnam is a business intelligence consultant and owner of ABI Cube, a company that specializes in delivering data solutions on the Microsoft Data Platform. She has over 20 years of experience and is recognized as a Microsoft Data Platform MVP for her technical excellence and community involvement. She is passionate about helping organizations harness the power of data to drive insights and innovation. She has deep expertise in Microsoft Analysis Services, Power BI, Azure Synapse Analytics, and Microsoft Fabric. She is also a speaker, blogger, and organizer for SQL Saturday Atlanta – BI edition, where she shares her knowledge and best practices with the data community.

Sponsor: CloudStaff.ai


Correlating Analysis Services Errors with Measures

This blog builds upon my previous post, “Resolving Tabular Conversion Errors”, and applies to Analysis Services in all flavors (Power BI, Multidimensional, and Tabular). In the scenario I described in the previous blog, the server at least told us the name of the offending measure in the error description. But sometimes you might not be that lucky. For example, recently I got this error when running a DAX query requesting many measures: “Microsoft OLE DB Provider for Analysis Services.” Hresult: 0x80004005 Description: “MdxScript(Model) (2000, 133) Failed to resolve name ‘SYNTAXERROR’. It is not a valid table, variable, or function name.” All we know is that there is a syntax error in some measure, but good luck finding it if you have hundreds of measures in the query and your model. However, the (2000, 133) part references the line number and column number in the MDX script (yes, MDX, even if you use Tabular), so if we can get that script, we might be able to correlate the error.

Getting that script is elusive, as the only way I know of for Tabular models is to trace the “Execute MDX Script Begin” or “Execute MDX Script End” events. But these events are generated only the first time a user with a given set of role permissions connects to a cube after processing, clearing the cache, or restarting the server. So, after you connect SQL Server Profiler to the cube and check either of these two events (you must click Show All Events because they are not shown by default), execute a Clear Cache command or process a table. Then, connect to the cube and execute any MDX or DAX query. In the Profiler, locate the “Execute MDX Script Begin” event. The payload should start with CALCULATE, followed by all measure definitions.

CALCULATE;
CREATE MEMBER CURRENTCUBE.Measures.[__Default measure] AS 1; ALTER CUBE CURRENTCUBE UPDATE DIMENSION Measures, Default_Member = [__Default measure];
CREATE
MEASURE'Date'[Days Current Quarter to Date]=COUNTROWS( DATESQTD( 'Date'[Date]))
MEASURE'Date'[Days in Current Quarter]=COUNTROWS( DATESBETWEEN( 'Date'[Date], STARTOFQUARTER( LASTDATE('Date'[Date])), ENDOFQUARTER('Date'[Date])))
MEASURE'Internet Sales'[Internet Distinct Count Sales Order]=DISTINCTCOUNT([Sales Order Number])
….

Copy that payload and paste it into an editor that has line numbers, such as Notepad++. Then go to the line number mentioned in the error description (you can press Ctrl+G in Notepad++).

Using this technique, I was able to narrow down the measure and discovered that I had missed a comma in the DIVIDE DAX function. I had made the change in Tabular Editor, which didn’t catch the syntax error.
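By the way, if you need to force these script events to fire again, the Clear Cache step can be scripted instead of processing a table. Here is a minimal sketch, assuming the SqlServer PowerShell module and hypothetical server and database names:

# Clear the cache so "Execute MDX Script Begin/End" fire on the next query
# (hypothetical server and database names; DatabaseID must match the database ID, which usually equals its name)
$xmla = @"
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>AdventureWorksTabular</DatabaseID>
  </Object>
</ClearCache>
"@
Invoke-ASCmd -Server "localhost\TABULAR" -Query $xmla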

Refreshing Power BI Datasets from SSIS

Scenario: You use SSIS to load data for an on-prem BI solution. As a last step of the ETL pipeline, you want to refresh a Power BI dataset. There’s quite a bit of misinformation on the Internet about how to do this, hence this blog.

Solution: If the dataset is hosted in a Power BI Pro workspace, the only way is to use the Power BI REST APIs, and there are some good examples out there using PowerShell or Power Automate flows. However, if the dataset is hosted in a Premium-per-user (PPU) or Premium workspace, you can use the SSIS built-in Analysis Services Processing Task. And, no, you don’t need third-party components.
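For reference, the REST approach looks roughly like this; a minimal sketch assuming the MicrosoftPowerBIMgmt module and hypothetical workspace and dataset IDs:

# Trigger a dataset refresh through the Power BI REST API
Import-Module MicrosoftPowerBIMgmt
Connect-PowerBIServiceAccount                  # prompts for sign-in; pass -Credential for unattended runs
$workspaceId = "11111111-2222-3333-4444-555555555555"   # hypothetical workspace (group) ID
$datasetId   = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"   # hypothetical dataset ID
Invoke-PowerBIRestMethod -Method Post -Url "groups/$workspaceId/datasets/$datasetId/refreshes"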

An unattended process can authenticate against Power BI using either a Power BI-licensed account or a service principal. I couldn’t get service principals to work with Power BI datasets. More than likely, this is because the service principal needs to be added as an Analysis Services administrator, which we can’t do with Power BI (we can with Azure Analysis Services, though). And so, the only option is to use a regular Power BI account (email and password). However, your organization probably uses multi-factor authentication (MFA) to secure access to cloud services. Because the SSIS process will run unattended, no one will be on the lookout to enter the authentication code. Therefore, you must harass your helpful system administrator to provide you with a user account that is not enabled for MFA.

And so, the steps are:

  1. Provision a dedicated account in Azure Active Directory. Ideally, the account should have a password that doesn’t expire.
  2. Assign the account (or, as a best practice, the AAD security group it belongs to) the Contributor role (or a higher Power BI role) on the workspace where the dataset resides.
  3. Enable the Power BI XMLA endpoint for ReadWrite.
  4. On your dev machine, install both the 32-bit and 64-bit versions of the latest Analysis Services MSOLAP providers. You need the 32-bit provider to test in Visual Studio because VS refuses to go 64-bit. However, if you run the package under SQL Agent, you’ll need the 64-bit provider.
  5. In Visual Studio, add a connector to the SSIS project that uses the SSIS MSOLAP100 Analysis Services provider, which is just a wrapper on top of the native SSAS MSOLAP provider. Configure the connector to connect to the dataset’s XMLA endpoint (you can copy it from the dataset settings in Power BI Service).
  6. Configure the SSIS connector to use the “Use a specific user name and password” option and plug in the credentials of the dedicated account you provisioned in step 1.

Gotchas: For some obscure reason, the “Test Connection” button might generate an error, but the processing task should work. Contrary to what you might believe, the “Allow saving password” option doesn’t persist the password, for security reasons. So, you either need to use an SSIS configuration or retype the password if you close and reopen Visual Studio. Once you deploy the SSIS project to the Integration Services catalog and schedule it with SQL Agent, make sure to update the connector’s connection string to store the password inside the SQL Agent job.
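For reference, the XMLA connection that the connector ends up using looks roughly like this (hypothetical workspace and dataset names):

Data Source=powerbi://api.powerbi.com/v1.0/myorg/Sales Workspace;Initial Catalog=Sales Dataset;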

Prologika Newsletter Summer 2022

The workhorse of any modern BI solution is the semantic model, which delivers unparalleled performance and contains the business logic and security roles. Microsoft BI gives us three options to host semantic models: SQL Server Analysis Services (SSAS), Azure Analysis Services (AAS), and Power BI. This newsletter explains why it might be beneficial to consider AAS and includes a script for automating resuming, processing, and synchronizing a model hosted on AAS.

Why Azure Analysis Services?

Microsoft BI practitioners have three options for hosting semantic models: SSAS (on prem), Azure Analysis Services (cloud), and Power BI (cloud). AAS is somewhat caught between a rock and a hard place. Given that Power BI gets the most attention for cloud deployment, why would you consider AAS at all? There are two main reasons:

  1. Cost – Organizational semantic models might require a lot of memory and crunching power. Hosting them on AAS might be more cost effective. For example, AAS S4 runs at around $5,000, which is about the same price point as Power BI Premium P1. However, it gives you 100 GB of RAM and 20 cores, whereas P1 has only 25 GB and 8 cores.
  2. Scaling out – A feature unique to AAS is the ability to scale out to multiple query replicas. This is not an option with Power BI Premium, and it requires quite a bit of setup with SSAS. However, AAS makes scaling out easy by just changing a slider (or with a line of PowerShell, as shown below). And once you’re done, you can pause it, so it doesn’t incur cost!
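Scaling out and pausing can also be automated. Here is a minimal sketch, assuming the Az.AnalysisServices module and hypothetical resource group and server names:

# Set the number of read-only query replicas, then pause the server when the workload is done
Import-Module Az.AnalysisServices
Set-AzAnalysisServicesServer -ResourceGroupName "my-rg" -Name "myaasserver" -ReadonlyReplicaCount 2
Suspend-AzAnalysisServicesServer -ResourceGroupName "my-rg" -Name "myaasserver"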

Automating the solution

Scaling out proved to be a useful feature lately when a client wanted to process massive queries in parallel. We cloned the model to AAS and wrote an ETL job to parallelize the query execution.

Note that the number of replicas depends on the data region and pricing level. For example, only East US 2 and West US support up to seven query replicas at S4. Another thing to watch for is that it’s not enough to just process the model on a scale-out farm. You’d also need to synchronize it across the query replicas. This can be done manually in the Azure Portal or automated, such as with the PowerShell script below, which you can plug into a SQL Agent job. The script uses a regular AAD account that has admin rights to the server. You can also use a service principal, but I opted for a regular account because Microsoft removed the option for a client secret with no expiration date (the maximum lifetime of a client secret is now two years).

Import-Module Az.AnalysisServices
Import-Module SqlServer   # provides Invoke-ProcessASDatabase
$password = "<account password>" | ConvertTo-SecureString -asPlainText -Force
$username = "craas@<domain>.com"
$aasendpoint = "asazure://aspaaseastus2.asazure.windows.net/crliveaas1"        # query (read-only) endpoint
$aasendpointmgmt = "asazure://aspaaseastus2.asazure.windows.net/crliveaas1:rw" # :rw targets the management (read-write) endpoint
$TenantId = "<tenant id>"
$credential = New-Object System.Management.Automation.PSCredential($username,$password)
$defaultProfile = Connect-AzAccount -Credential $credential -Tenant $TenantId

Set-AzContext -Tenant $TenantId -DefaultProfile $defaultProfile
$server = Get-AzAnalysisServicesServer -ResourceGroupName "crliveaas_rg" -Name "crliveaas1" -DefaultProfile $defaultProfile
if ($server.State -eq "Paused")
{
    Resume-AzAnalysisServicesServer -Name "crliveaas1" -ResourceGroupName "crliveaas_rg"  
    #process database; ClearValues removes the data to reduce the memory footprint
    Invoke-ProcessASDatabase -Server $aasendpointmgmt -DatabaseName "<databasename>" -RefreshType "ClearValues" -Credential $credential
    Invoke-ProcessASDatabase -Server $aasendpointmgmt -DatabaseName "<databasename>" -RefreshType "Full" -Credential $credential

    # sync database
    Add-AzAnalysisServicesAccount  -Credential:$credential -RolloutEnvironment:"aspaaseastus2.asazure.windows.net"
    Sync-AzAnalysisServicesInstance -Instance $aasendpointmgmt -Database "<databasename>" -PassThru
}

Conclusion

When it comes to cloud deployment of Analysis Services semantic models, Power BI is preferable because you’re always on the latest features. However, Azure Analysis Services can help you reduce cost and scale out query execution – a feature that Power BI doesn’t support.


Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics


A Case for Azure Analysis Services

Microsoft BI practitioners have three options for hosting semantic models: SSAS (on prem), Azure Analysis Services (cloud), and Power BI (cloud). AAS is somewhat caught between a rock and a hard place. Given that Power BI gets the most attention for cloud deployment, why would you consider AAS at all? There are two main reasons:

  1. Cost – Organizational semantic models might require a lot of memory and crunching power. Hosting them on AAS might be more cost effective. For example, AAS S4 runs at around $5,000, which is about the same price point as Power BI Premium P1. However, it gives you 100 GB of RAM and 20 cores, whereas P1 has only 25 GB and 8 cores.
  2. Scaling out – A feature unique to AAS is the ability to scale out to multiple query replicas. This is not an option with Power BI Premium, and it requires quite a bit of setup with SSAS. However, AAS makes scaling out easy by just changing a slider. And once you’re done, you can pause the instance, so it doesn’t incur cost!

Scaling out proved to be a useful feature lately when a client wanted to process massive queries in parallel. We cloned the model to AAS and wrote an ETL job to parallelize the query execution.

Note that the number of replicas depends on the data region and pricing level. For example, only East US 2 and West US support up to seven query replicas at S4. Another thing to watch for is that it’s not enough to just process the model on a scale-out farm. You’d also need to synchronize it across the query replicas. This can be done manually in the Azure Portal or automated, such as with the PowerShell script below, which you can plug into a SQL Agent job. The script uses a regular AAD account that has admin rights to the server. You can also use a service principal, but I opted for a regular account because Microsoft removed the option for a client secret with no expiration date (the maximum lifetime of a client secret is now two years).

Import-Module Az.AnalysisServices
Import-Module SqlServer   # provides Invoke-ProcessASDatabase
$password = "<account password>" | ConvertTo-SecureString -asPlainText -Force
$username = "craas@<domain>.com"
$aasendpoint = "asazure://aspaaseastus2.asazure.windows.net/crliveaas1"        # query (read-only) endpoint
$aasendpointmgmt = "asazure://aspaaseastus2.asazure.windows.net/crliveaas1:rw" # :rw targets the management (read-write) endpoint
$TenantId = "<tenant id>"
$credential = New-Object System.Management.Automation.PSCredential($username,$password)
$defaultProfile = Connect-AzAccount -Credential $credential -Tenant $TenantId

Set-AzContext -Tenant $TenantId -DefaultProfile $defaultProfile
$server = Get-AzAnalysisServicesServer -ResourceGroupName "crliveaas_rg" -Name "crliveaas1" -DefaultProfile $defaultProfile
if ($server.State -eq "Paused")
{
    Resume-AzAnalysisServicesServer -Name "crliveaas1" -ResourceGroupName "crliveaas_rg"  
    #process database, first clear the data so processing doesn't go over memory limit
    Invoke-ProcessASDatabase -Server $aasendpointmgmt -DatabaseName "<databasename>" -RefreshType "ClearValues" -Credential $credential
    Invoke-ProcessASDatabase -Server $aasendpointmgmt -DatabaseName "<databasename>" -RefreshType "Full" -Credential $credential

    # sync database
    Add-AzAnalysisServicesAccount -Credential:$credential -RolloutEnvironment:"aspaaseastus2.asazure.windows.net"
    Sync-AzAnalysisServicesInstance -Instance $aasendpointmgmt -Database "<databasename>" -PassThru
}

Chasing SSAS Connection Timeouts

Suppose you have a Tabular model and you send it a massive DAX query that could run for hours, such as one that calculates many measures (in our case hundreds) for each customer overnight so that you can cache the results and delight the user with super-fast lookups. This issue could also apply to Multidimensional, although in our case Tabular was used. The server sporadically times out the query after a random execution time. You have changed all possible connection timeout options (SSAS ServerTimeout, SSIS connection timeout, etc.) to no avail. In fact, if you have scheduled an Agent job that calls an SSIS package that executes the query, the package doesn’t register the exception and continues executing indefinitely, but a Profiler trace (or xEvents session) shows that the server raises a Connection Timeout error.

How to fix this horrible issue? Change these two undocumented settings in the msmdsrv.ini file from 60,000 to 600,000:

<ServerSendTimeout>600000</ServerSendTimeout>

<ServerReceiveTimeout>600000</ServerReceiveTimeout>
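To check the current values, you can search the configuration file directly. Here is a minimal sketch, assuming a default SSAS 2019 instance (adjust the hypothetical path for your instance and version):

# Show the current send/receive timeout settings in msmdsrv.ini
Select-String -Path "C:\Program Files\Microsoft SQL Server\MSAS15.MSSQLSERVER\OLAP\Config\msmdsrv.ini" -Pattern "ServerSendTimeout|ServerReceiveTimeout"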

Solving RLS Gotchas

Scenario: You’ve created a beautiful, wide-open Tabular model. You use USERELATIONSHIP() to switch relationships on and off. Everything works and everyone is pleased. Then RLS sneaks in, such as when external users need access, and you must secure some dimension table. You create a role, specify a row filter, test the role, and get greeted with:

The UseRelationship() and CrossFilter() functions may not be used when querying ‘<dimension table>’ because it is constrained by row-level security defined on ‘<dimension table>’ or related tables.

Analysis: There is a long-standing Tabular limitation that blocks USERELATIONSHIP as an added level of security, and it may be triggered even if USERELATIONSHIP doesn’t activate a relationship on the security propagation path. This is done to prevent information disclosure in case there is some other active relationship (since USERELATIONSHIP would disable security propagation across that other relationship). Unfortunately, the current design boils down to “no inactive relationship, no problem”. A better option would have been to introduce a metadata attribute at the table (or relationship) level to relax this rule.

Workaround: Since currently there is no magic switch, you need to find a workaround depending on your specific case. For example, in one case where only external users were affected, I added a new set of measures. I didn’t change the original measures for two reasons: a) to avoid re-testing the entire model, and b) dynamic relationships always underperform materialized relationships. The new set could use INTERSECT (or TREATAS if you’re on SQL Server 2016 or later) to replace USERELATIONSHIP. For example, instead of:

USERELATIONSHIP(Policy[Branch Number], Division[Branch Number])

You could use:

INTERSECT(VALUES(Division[Branch Number]), VALUES(Policy[Branch Number]))

Note that you might not get exactly the same behavior because materialized and dynamic relationships differ in how missing members are handled (see my blog “Propagating DAX BLANK() Over Relationships” to understand this better).

Showing Database Images in Power BI and Tabular

The Power BI image-rendering visuals, such as Table or Card, expect image links that point to public servers hosting the images with anonymous access. This has obvious shortcomings. Can we load images from a database or a Power BI data table? You bet, as Jason Thomas demonstrated a long time ago. Here are the steps I followed to show the images from the Production.ProductPhoto table in the AdventureWorks2012 (or later) database. If you want to embed a few images into a Power BI data table (instead of an external database), you can convert them manually to Base64 using any of the online image converters, such as https://codebeautify.org/image-to-base64-converter, and embed the resulting string into a Power BI data table (the Enter Data feature). Gerhard Brueckl takes this one step further by showing how to automate the Base64 conversion for many images.
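If you’d rather script the conversion yourself for a single image, a minimal PowerShell sketch (hypothetical file path) could look like this:

# Convert an image file to a Base64 data URI suitable for the Enter Data table
$bytes = [System.IO.File]::ReadAllBytes("C:\Temp\product.jpg")   # hypothetical path
$dataUri = "data:image/jpeg;base64," + [System.Convert]::ToBase64String($bytes)
$dataUri | Set-Clipboard   # paste the resulting string into the Enter Data table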

  1. Import the table with the image column as usual in Power BI.
  2. In Power Query, change the data type of the image column to Text.
  3. Add a new custom column that prefixes the Base64 string with “data:image/jpeg;base64,” (note the comma at the end) for JPEG images or “data:image/png;base64,” for PNG.
    = Table.AddColumn(#"Changed Type", "ProductImageEncoded", each "data:image/jpeg;base64," & [ThumbNailPhoto])
    This is what the final Power Query transformation should look like. Click Close & Apply to import the table into the data model once you’re done with the transformation part.
  4. In Power BI Desktop, select the ProductImageEncoded field in the Fields pane. Assuming the new ribbon, in the Column tools ribbon, change the data category to Image URL.
  5. Add the ProductImageEncoded field to a Table, Card, Multi-row Card, or Slicer visual.

Power BI Time Adventures

A customer reported that a Power BI date filter/slicer set to a specific date in the Date dimension, let’s say April 24, 2020, doesn’t return some rows from the fact table that match that date. Upon some digging, it turned out the data was imported from Dynamics CRM and the source date column had UTC time. Power Query showed Date/Time/Timezone as the data type. However, the developer had converted the corresponding field in the model to the Power BI Date data type to remove the time portion. And indeed, Data View would show that date as April 24, 2020 (without the time portion). So, why no match?

The xVelocity storage engine (the storage engine behind Power BI and Analysis Services Tabular) has only one data type for dates: DateTime. If you convert the field in the data model to Date, it still keeps the time portion (UTC or not) and doesn’t change the column storage. It just changes the column formatting to show the date only.

To get rid of the time portion, cast to Date in the data source (e.g., in a wrapper SQL view) or change the data type of the column in Power Query to Date. This will “remove” the time portion. In reality, xVelocity will convert the time to midnight.

Power BI Embedded, Service Principals, and AAS

In my previous post “Power BI Embedded, Service Principals, and SSAS”, I discussed how you can integrate Power BI Embedded (App Owns Data) configured for service principal authentication with SSAS to pass the effective user identity. One important observation is that you can use this approach with both internal and external users. For internal users, the Power BI gateway (running under an account that has admin rights to the SSAS instance) passes the effective user identity under the EffectiveUserName connection string setting. The effective user identity maps to the user’s UPN, such as john.doe@prologika.com, so that SSAS can map it to the corresponding Active Directory account. For external users, you can configure the gateway for CustomData and pass whatever you want as the effective user identity.

Suppose that one day you migrate your model to Azure Analysis Services (AAS). Will it work? Unfortunately, no. Since there is no gateway between Power BI and AAS, there isn’t a layer to authenticate using a trusted account. So, the Power BI team has decided to go only with CustomData, and Power BI Embedded supports a special customData parameter that works only with AAS. Although the documentation doesn’t emphasize this difference, it has an important paragraph: “The only way to have dynamic RLS (which uses dynamic values for filter evaluation) in Azure Analysis Services, is using the CUSTOMDATA() function”. Let’s break this down.

  1. You must use the Object ID of the service principal account when you construct your effective identity. See my previous blog for how to obtain that identifier. Attempting to pass anything other than the Object ID will result in a Forbidden error when the code attempts to obtain the embed token by calling client.Reports.GenerateTokenInGroup().
    var identity = new EffectiveIdentity("<Object ID GUID>", new List<string> { report.DatasetId }, customData: "someuser@acme.com");
  2. You must use the customData parameter to pass whatever identifier your AAS row-level security will use to authorize the interactive user. DAX can obtain this identifier from the CUSTOMDATA() function.
  3. You must add the service principal Object ID to each AAS security role in which the user needs to be evaluated.

Your Power BI Embedded App Owns Data implementation will need different code for SSAS and AAS. The AAS version relies on CUSTOMDATA for handling row-level security.

Although this implementation path is fundamentally different from SSAS, it will work with external users that are not part of your Azure AD. Note that internal users registered in Azure AD cannot simply be added to AAS roles either. That would be pointless because you won’t be able to pass their identity under EffectiveUserName, and AAS won’t be able to evaluate them as AAD users. So, both internal and external users must somehow go through CUSTOMDATA.