Power BI Embedded, Service Principals, and SSAS

Power BI Embedded supports two ways that your custom app can authenticate to Power BI using a trusted account: master account (the original option) and more recently service principle. Service principal authentication is preferred because it doesn’t require storing and using credentials of a Power BI Pro account. Configuring the service principal and embedding reports with imported data is easy. Not so much with embedding reports connected to on-prem Analysis Services models, mainly because of documentation gaps. Here are some notes you might find useful that I harvested from a recent engagement.

  1. Unlike what the documentation states that only SSAS models with RLS requires it, you must grant the service principal ReadOverrideEffectiveIdentity permission. Otherwise, the service principal can’t delegate the user identity to the gateway. So, the gateway admin must call the Gateways – Add Datasource User API.

    TIP: Don’t write code but use the documentation page to call the API (isn’t this nice!). Construct the body to look like this:
    {

“identifier”: “3d9b93c6-7b6d-4801-a491-1738910904fd“,

“datasourceAccessRight”: “ReadOverrideEffectiveIdentity”,

“principalType”: “App”

}

What the API page doesn’t tell you is what you need to use for the identifier in the request body. Your first attempt might be to use ApplicationID, but you’ll get greeted with error ” DMTS_PrincipalsAreInvalidError”. Instead, you must use the object id (in Azure portal, go to the app registration, and then click the app name in the “Managed Application in local directory” property).

Then, on the next page, copy Object ID.

  1. Once the permission is granted, your custom app must construct an effective identity, such as by using this code. Notice you need to get to the report dataset to check how it’s configured. The GetDatasetByIdInGroup API requires the workspace identifier. The code assumes that the dataset is in the report’s workspace. If you need to embed reports connected to shared datasets which reside in other workspaces, you have a problem because the workspace id is not available in the report object. The best way your app can address this might be to maintain a reference map for these reports (reportId, workspaceId). I hope Microsoft changes the GetDatasetByIdInGroup to not require a workspace (not sure why is needed at all with guids).
  2. var dataset = client.Datasets.GetDatasetByIdInGroup(WorkspaceId, report.DatasetId);
  3. var IsEffectiveIdentityRequired = dataset.IsEffectiveIdentityRequired;
  4. var IsEffectiveIdentityRolesRequired = dataset.IsEffectiveIdentityRolesRequired;
  5. GenerateTokenRequest generateTokenRequestParameters = null;
  6. if (!(bool)IsEffectiveIdentityRequired)
  7. // reports with imported data that don’t require RLS
  8. {
  9. // Generate Embed Token for reports without effective identities.
  10. generateTokenRequestParameters = new GenerateTokenRequest(accessLevel: “view”);
  11. }
  12. else
  13. // reports connecting to RLS datasets and Analysis Services
  14. {
  15. var identity = new EffectiveIdentity(“<replace with the interactive user email, e.g. john@acme.com>”,
  16.     new List<string> { report.DatasetId });
  17. var roles = “”; // if you want the user to evaluated as a member of a certain RLS role, replace with a comma-delimited list of roles
  18. if (!string.IsNullOrWhiteSpace(roles))
  19. {
  20. var rolesList = new List<string>();
  21. rolesList.AddRange(roles.Split(‘,’));
  22. identity.Roles = rolesList;
  23. }
  24. // Generate Embed Token with effective identities.
  25. generateTokenRequestParameters = new GenerateTokenRequest(accessLevel: “view”,
  26.     identities: new List<EffectiveIdentity> { identity });
  27. }

    Notice that in line 16, your app must pass a valid Windows login to Analysis Services because behind the scenes the gateway will be append EffectiveUserName to the connection string.

Atlanta MS BI and Power BI Group Meeting on February 3rd

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on February 3rd, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Shabnan Watson will discuss how to apply aggregations to Power BI DirectQuery datasets to improve report performance.  Melissa will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Aggregations in Power BI
Date:February 3rd, 2020
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:Aggregations are one of the most important optimization methods for managing big datasets in Power BI. Combined with Direct Query storage mode, they allow big datasets to be analyzed efficiently by answering high level analytical queries quickly from memory while sending more detailed queries back to the source database. In this session, you will learn about the concept of aggregations, different table storage modes in Power BI, different kinds of aggregation tables, how to configure aggregation tables so that they can answer high level user queries, and finally how to use tools such as DAX Studio or Extended Events to determine if the aggregations are actually being used.
Speaker:Shabnam Watson is a Business Intelligence consultant with 18 years of experience developing data warehouse and BI solutions. Her work focus within the Microsoft BI Stack has been on Analysis Services and Power BI. She is an active member of PASS community and has spoken at PASS Summit, PASS SQL Saturdays, PASS Women In Technology Virtual Chapter, and other Local user groups. She is one of the organizers of SQL Saturday Atlanta and SQL Saturday Atlanta BI Edition. She holds a master’s degree in computer science, a bachelor’s degree in Computer Engineering, and a Certified Business Intelligence Professional (CBIP) certification by The Data Warehouse Institute (TDWI).
Sponsor:Bad data is bad business. Melissa helps organizations profile, cleanse and verify, dedupe and enrich all their people data (name, address, email and phone number) and more.  With clean, accurate and up-to-date customer information, organizations can monetize Big Data, improve sales and marketing, reduce costs and drive business insight. https://www.melissa.com/
Prototypes with PizzaTBD

PowerBILogo

Embracing Automated Machine Learning (AutoML)

With the growing demand for predictive analytics, Automated Machine Learning (AutoML) aims to simplify and democratize predictive analytics so business users can create their own predictive models. The promise of AutoML is to bring predictive analytics to business users, just like Power BI democratizes data analytics, Power Apps democratizes app dev, and Power Query democratizes data shaping and transformation.

As a business user, the two most popular options for applying Automated Machine Learning for predictive analytics are Power BI and AzureML. Behind the scenes, Power BI AutoML uses the automated machine learning feature of AzureML but there are differences and I summarize below the most important ones.

Power BI AutoMLAzureML AutoML
LicensingPower BI PremiumAzure ML (Enterprise Edition recommended)
ContainerDataflowExperiment
Power QueryAvailableNot available
Supported data sourcesManyA few (local files, Azure SQL DB, ADLS, and a few more)
ModelNot Accessible (Power BI handles everything)Accessible
Web service endpointNot available outside Power BIAvailable for app integration
ScoringApply the model to entityVarious options (Notebooks, SDK, custom integration)

To me, the best solution would have been the combination of both. I like Power Query for sourcing, shaping and transforming the data, but I also like the flexibility that AzureML brings. Unfortunately, you can’t mix and match. It appears that AzureML has decided to roll out their own data connectivity mechanism and as a result, it supports a limited number of data sources (for example, on-prem data sources are not accessible). Of course, this will probably change soon as the product evolves.

I’ve done recently some work with Azure ML Studio (https://ml.azure.com/), and I’m impressed. Microsoft has learned important lessons from the previous AzureML (now called “classic”) and greatly enhanced the product. If you’re looking for a SaaS ML toolset that targets both business users and data scientists, AzureML should be on the top of your list. Speaking of its AutoML feature, the main advantages that it brings for predictive analytics are:

  • Determining the model type – classification, regression, and time series forecasting (the last one is not available yet in Power BI)
  • Automatic featurization
  • Selecting the best algorithm – For example, the screenshot below shows how AzureML has tested various algorithms and determined that VotingEnsemble performs the best.

Even if you’re a data scientist, the best algorithm selection feature alone justifies giving AutoML a try – if not for anything else but to select the best algorithm so that you don’t have spend enormous time testing different algorithms.

Get Power BI Training at Power Platform World Tour

Register for my full-day academy training at #PowerPlatformWT in Atlanta on Feb 10th for only $599 and learn how Power BI can bring your data to life!

011520_2018_GetPowerBIT1.png

Stop Using JavaScript for Data Security

As the poet said, “let me count the ways”. I’ve been amazed by the extent some organizations would go into inconveniencing their employees in the pursuit of better security while they do such dumb things. In one case, an organization would automatically log out login sessions after a certain period of inactivity, allow employees to access Power BI only on company’s approved devices, and prevent developers from accessing servers. At the same time, the same organization would configure an Internet proxy server for Basic authentication where user passwords are transmitted in plain text (not even https)! All the hacker must do is to plug into the corporate network and intercept the login passwords.

Yet, another organization would require developers to tunnel into a VDI environment, from where they will tunnel one more time into an Azure VM, before they can access Azure resources, such as an Azure SQL DB. Wasn’t the Cloud supposed to make things easier? Yet, developers within the same organization would use JavaScript to apply data security. Since this is the second time I see this, please repeat after me.

Thou shall not use the Power BI Embedded JavaScript APIs to apply data security.

In this case, the developer has decided to pass filters to the Power BI Embedded configuration object (notice the filters property) to restrict the data. The problem is that JavaScript code is not secure. This code will run on the client side and in the user’s browser session. All it takes is to put the browser in a debug mode and nuke the filters.

The Power BI Embedded JavaScript APIs were never meant as a replacement of data security (RLS). Instead, the scenario here is to let the developer provide an alternative filtering UI that could replace the default Power BI filter pane. Defaulting a filter, such as a Date filter, is OK since overwriting the filter is not a big deal. However, setting up a hidden filter with the customer identifier to limit data to that customer is not. This code must execute on the server.

Power BI Pro Storage Quota

Although Power BI has been evolving for almost five years now, basic concepts are sometimes worth revisiting. Recently, I had a discussion regarding the Power BI Pro storage quota on the Power BI MVP list and I want to share the conclusions confirmed by Microsoft.

For workspaces in shared capacity licensed with Power BI Pro (not a workspace in a Premium capacity):

  1. There is a per-workspace storage limit of 10 GB. So, My Workspace gets 10 GB and so does any org workspace.
  2. There is also an unofficial cross-workspace aggregate quota of 10 GB * the number of Pro User Licenses intended as a backstop to prevent abuse so that a Pro user doesn’t keep on indefinitely creating workspaces to get new chunks of 10 GB. So, if you have 50 Power BI Pro users, the aggregate cross-workspace storage quota would be 500 GB irrespective if only one or multiple Pro users contribute. You won’t see the cross-workspace quota in the Power BI Service UI and it’s not exposed through service code.

BTW, you should ignore the “Manage data storage in Power BI workspaces” document until it’s been updated (the current timestamp is 12/20/2018). As it stands, this document contains wrong and incomplete information. For example, sharing datasets, reports, dashboards should have no effect on the workspace storage quota for consumers.

Atlanta MS BI and Power BI Group Meeting on January 6th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on January 6th, Mondayat 6:30 PM at the Microsoft office in Alpharetta. I’ll introduce to Power BI Premium Automated Machine Learning (AutoML). Prologika will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Power BI Automated Machine Learning (AutoML)
Date:January 6th, 2020
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:With the growing demand for predictive analytics, Automated Machine Learning (AutoML) aims to simplify this process and democratize Machine Learning so business users can create their own basic predictive models. Join this presentation to learn how to apply AutoML in Power BI Premium to predict the customer probability to purchase a product. I’ll show you the end-to-end AutoML process, including:

·       Create a dataflow

·       Choose a field to predict

·       Choose a model type

·       Select input variables (features)

·       Train the model

·       Apply the model to new data

·       Bonus: Integrate Power BI with AzureML

Speaker:Through his Atlanta-based company Prologika (https://prologika.com), a Microsoft Gold Partner in Data Analytics, Teo Lachev helps organizations make sense of their most valuable asset: their data. His strategy formulation, trusted advisory and mentoring, design and implementation services empower clients to apply effectively data analytics in order to understand, improve, and transform their business processes. Teo has authored and co-authored several books on organizational and self-service data analytics, and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Teo has been a Microsoft Most Valued Professional (MVP) Data Platform since 2004.
Sponsor:Prologika is one of the most trusted names in Data Analytics. Our clients, from small businesses to Fortune 100 enterprises, derive tremendous value from our services. Our mission is to help organizations make sense of data by applying the latest technologies for descriptive and predictive analytics and get actionable insights. Your organization will spend less time mining for information and be better equipped to make sound business decisions. https://prologika.com

PowerBILogo

Applied Power BI Book (5th Edition)

I’m excited to announce the fifth edition of my Applied Microsoft Power BI book! When the first edition was published  in January 2016, it was the first Power BI book at that time and it had less than 300 pages.  Since then, I helped many companies adopt or transition to Power BI and taught hundreds of students. It’s been a great experience to witness the momentum surrounding Power BI and how the tool has matured over time. As a result, the book also got thicker and it now stands at 528 pages. However, I believe what’s more important is that this book provides systematic, yet dependent, view by showing what Power BI can do for four types of users (business users, analysts, pros, and developers).

To my understanding, this is the only Power BI book that gets annual revisions to keep it up to date with this ever changing technology! Because I had to draw a line somewhere, Applied Microsoft Power BI (5th edition) covers all features that are in preview or released by December 2019. As with my previous books, I’m committed to help my readers with book-related questions and welcome all feedback on the book discussion forum on the book page. While you are there, feel free to check out the book resources (sample chapter, front matter, and more). I also encourage you to follow my blog at https://prologika.com/blog and subscribing to my newsletter at https://prologika.com to stay on the Power BI latest.

Bring your data to life for the forth time! Keep on reading, learning, and Happy New Year!

apbi5