Stop Using JavaScript for Data Security

As the poet said, “let me count the ways”. I’ve been amazed by the lengths some organizations will go to inconvenience their employees in the pursuit of better security while overlooking glaring holes. In one case, an organization would automatically log out sessions after a certain period of inactivity, allow employees to access Power BI only on company-approved devices, and prevent developers from accessing servers. At the same time, the same organization would configure an Internet proxy server for Basic authentication, where user passwords are transmitted in plain text (not even HTTPS)! All a hacker must do is plug into the corporate network and intercept the login passwords.

Yet another organization would require developers to tunnel into a VDI environment, from where they would tunnel one more time into an Azure VM, before they could access Azure resources, such as an Azure SQL Database. Wasn’t the cloud supposed to make things easier? Yet, developers within the same organization would use JavaScript to apply data security. Since this is the second time I’ve seen this, please repeat after me.

Thou shalt not use the Power BI Embedded JavaScript APIs to apply data security.

In this case, the developer had decided to pass filters to the Power BI Embedded configuration object (notice the filters property in the sketch below) to restrict the data. The problem is that this code is not secure. It runs on the client side, in the user’s browser session. All it takes is opening the browser’s developer tools and removing the filters.
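
To make this concrete, here is a minimal sketch of the anti-pattern using the powerbi-client TypeScript library (the report ID, embed URL, token, and the Customer table and CustomerID column are all hypothetical placeholders):

import * as pbi from "powerbi-client";
import { models } from "powerbi-client";

// DON'T DO THIS: a client-side filter is just data in the browser.
// Anyone can open the developer tools and remove or change it.
const customerFilter: models.IBasicFilter = {
  $schema: "http://powerbi.com/product/schema#basic",
  target: { table: "Customer", column: "CustomerID" }, // hypothetical table/column
  operator: "In",
  values: [12345], // the customer the user is "allowed" to see
  filterType: models.FilterType.Basic,
  requireSingleSelection: false
};

const embedConfig: pbi.IEmbedConfiguration = {
  type: "report",
  id: "<report id>",
  embedUrl: "<embed url>",
  accessToken: "<embed token>",
  tokenType: models.TokenType.Embed,
  filters: [customerFilter] // trivially removable on the client
};

// Embed the report with the (insecure) filter applied on load.
const powerbiService = new pbi.service.Service(
  pbi.factories.hpmFactory, pbi.factories.wpmpFactory, pbi.factories.routerFactory);
powerbiService.embed(document.getElementById("reportContainer")!, embedConfig);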

The Power BI Embedded JavaScript APIs were never meant as a replacement for data security, such as row-level security (RLS). Instead, the intended scenario is to let the developer provide an alternative filtering UI that replaces the default Power BI filter pane. Defaulting a filter, such as a Date filter, is OK, since overwriting it is not a big deal. However, setting up a hidden filter with the customer identifier to limit data to that customer is not. That logic must execute on the server, as the sketch below shows.
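
For contrast, here is a minimal sketch of the server-side approach, assuming the dataset defines an RLS role that filters on USERNAME() (the CustomerFilter role name and the IDs below are hypothetical). The server requests an embed token scoped to an effective identity via the Power BI REST API, so the model itself, not the browser, enforces the filter:

// Server-side (Node.js): request an embed token scoped to an RLS role.
// The browser only ever receives this restricted token.
async function getEmbedToken(aadAccessToken: string): Promise<string> {
  const groupId = "<workspace id>"; // hypothetical placeholders
  const reportId = "<report id>";
  const datasetId = "<dataset id>";

  const response = await fetch(
    `https://api.powerbi.com/v1.0/myorg/groups/${groupId}/reports/${reportId}/GenerateToken`,
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${aadAccessToken}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        accessLevel: "View",
        identities: [{
          username: "12345",         // e.g. the customer identifier your RLS rules compare to USERNAME()
          roles: ["CustomerFilter"], // hypothetical RLS role defined in the dataset
          datasets: [datasetId]
        }]
      })
    });

  const { token } = await response.json();
  return token; // pass only this token to the client-side embed configuration
}

Even if the user tampers with the page, the embed token carries the identity, and RLS in the dataset does the filtering.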

Power BI Pro Storage Quota

Although Power BI has been evolving for almost five years now, basic concepts are sometimes worth revisiting. Recently, I had a discussion about the Power BI Pro storage quota on the Power BI MVP list, and I want to share the conclusions, which Microsoft confirmed.

For workspaces in shared capacity licensed with Power BI Pro (as opposed to workspaces in a Premium capacity):

  1. There is a per-workspace storage limit of 10 GB. So, My Workspace gets 10 GB and so does any org workspace.
  2. There is also an unofficial cross-workspace aggregate quota of 10 GB multiplied by the number of Pro user licenses. It’s intended as a backstop against abuse, so that a Pro user can’t keep creating workspaces indefinitely to get new 10 GB chunks. So, if you have 50 Power BI Pro users, the aggregate cross-workspace storage quota would be 500 GB, regardless of whether one or many Pro users contribute. You won’t see the cross-workspace quota in the Power BI Service UI, and it’s not exposed programmatically.

BTW, you should ignore the “Manage data storage in Power BI workspaces” document until it’s been updated (the current timestamp is 12/20/2018). As it stands, this document contains wrong and incomplete information. For example, sharing datasets, reports, and dashboards should have no effect on the workspace storage quota for consumers.

Atlanta MS BI and Power BI Group Meeting on January 6th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, January 6th, at 6:30 PM at the Microsoft office in Alpharetta. I’ll introduce you to Power BI Premium Automated Machine Learning (AutoML). Prologika will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation: Power BI Automated Machine Learning (AutoML)
Date: January 6th, 2020
Time: 6:30 – 8:30 PM ET
Place: Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview: With the growing demand for predictive analytics, Automated Machine Learning (AutoML) aims to simplify this process and democratize Machine Learning so that business users can create their own basic predictive models. Join this presentation to learn how to apply AutoML in Power BI Premium to predict a customer’s probability to purchase a product. I’ll show you the end-to-end AutoML process, including:

  • Create a dataflow
  • Choose a field to predict
  • Choose a model type
  • Select input variables (features)
  • Train the model
  • Apply the model to new data
  • Bonus: Integrate Power BI with AzureML

Speaker: Through his Atlanta-based company Prologika (https://prologika.com), a Microsoft Gold Partner in Data Analytics, Teo Lachev helps organizations make sense of their most valuable asset: their data. His strategy formulation, trusted advisory and mentoring, and design and implementation services empower clients to apply data analytics effectively in order to understand, improve, and transform their business processes. Teo has authored and co-authored several books on organizational and self-service data analytics, and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Teo has been a Microsoft Most Valuable Professional (MVP) – Data Platform since 2004.
Sponsor: Prologika is one of the most trusted names in Data Analytics. Our clients, from small businesses to Fortune 100 enterprises, derive tremendous value from our services. Our mission is to help organizations make sense of data by applying the latest technologies for descriptive and predictive analytics to get actionable insights. Your organization will spend less time mining for information and be better equipped to make sound business decisions. https://prologika.com


Applied Power BI Book (5th Edition)

I’m excited to announce the fifth edition of my Applied Microsoft Power BI book! When the first edition was published in January 2016, it was the first Power BI book on the market, and it had fewer than 300 pages. Since then, I’ve helped many companies adopt or transition to Power BI and taught hundreds of students. It’s been a great experience to witness the momentum surrounding Power BI and how the tool has matured over time. As a result, the book also got thicker, and it now stands at 528 pages. However, I believe what’s more important is that this book provides a systematic, yet independent, view by showing what Power BI can do for four types of users (business users, analysts, pros, and developers).

To my knowledge, this is the only Power BI book that gets annual revisions to keep it up to date with this ever-changing technology! Because I had to draw a line somewhere, Applied Microsoft Power BI (5th edition) covers all features that are in preview or released by December 2019. As with my previous books, I’m committed to helping my readers with book-related questions and welcome all feedback on the discussion forum on the book page. While you are there, feel free to check out the book resources (sample chapter, front matter, and more). I also encourage you to follow my blog at https://prologika.com/blog and subscribe to my newsletter at https://prologika.com to stay up to date with the latest in Power BI.

Bring your data to life for the fifth time! Keep on reading, learning, and Happy New Year!


Prologika Newsletter Winter 2019

Happy Holidays! I hope you’re enjoying this special time of the year. A few months ago, I did an assessment for a large company that had been advised by an undisclosed source to use its Dynamics Financials and Operations (F&O) system as a data warehouse. Recently, I came across a similar wish, but this time to use SAP as an EDW. Can we do this? Before I give you my opinion, I’m excited to announce the availability of the fifth edition of my “Applied Microsoft Power BI” book – the only Power BI book that is updated every year to keep up with the ever-changing world of Power BI and the Microsoft Data Platform! The book is slowly making its way to retailers and should be available on Amazon in the first days of 2020.

Operational Reporting

I understand that everyone wants to do more with less, and shortcuts are tempting. But ERP systems are systems of record, just like any other data source. True, they could own most of the core data that you need for analytics. But that data is normalized and stored in a format that’s not conducive to analytics. To make things worse, Dynamics doesn’t even give you direct access to the SQL Server database of your production instance. You must go through REST APIs or export data to gain access to it. And to add new tables, you must create entities in Visual Studio! Still want to build a data warehouse in Dynamics?

ERP systems typically have some reporting features, but these features usually deliver only operational reporting. Operational reporting has a narrow view concerned with “now”, such as a report that shows customers with outstanding balances as of today. For example, Dynamics comes with standard SSRS reports. You could also enable analytical workspaces that deliver reports via Power BI Embedded. These reports, however, are operational reports. By contrast, BI is mostly concerned with historical and trend analysis.

BI Axioms

In math, axioms are statements that are assumed to be correct without a proof. We need BI axioms and the list can start like this:

  • Every mid-size to large company shall have a centralized data repository for consolidating trusted data that is optimized for reporting. The necessity for such a repository is directly correlated with the number of data sources that must be integrated (that number will increase over time) and the complexity of the data transformations. The centralized data repository is commonly referred to as a data warehouse.
  • Dimensional modeling shall be the methodology to design the data warehouse schema. I know it’s tempting to declare your ODS as a data warehouse, but highly normalized schemas are not suitable for reporting.
  • If you’re after a single version of the truth, you shall have an organizational semantic layer. Find out why you need it in my “WHY SEMANTIC LAYER?” newsletter.
  • ERP systems are not a replacement for a data warehouse. Neither are data lakes and Big Data.
  • You shall have both organizational and self-service BI, and they should complement each other. If you lean too much toward organizational BI, you’ll get a backlog of requirements. If you lean too much toward self-service BI, you’ll end up with fragmented “spreadmarts”, which is where you probably started.
  • Most of the BI effort shall go toward organizational BI to integrate data, improve data quality, and centralize business calculations. Tools come and go but this effort shall endure.
  • Agile and managed self-service BI shall fill in the gaps. It should provide a feedback loop to extend organizational BI with data that the entire organization can benefit from.

 


Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Gold Partner | Data Analytics


Power Platform World Tour

Organized by Microsoft and Dynamic Communities, the Power Platform World Tour will take place in Atlanta from 2/10 to 2/12, 2020. I’m teaching Power BI Dashboard in a Day (DIAD) on Feb 10 for a full day. Although this is a paid event ($599), you should get great business value, as the audience will probably be smaller and I’ll be able to provide more personal attention. Then, I’ll present “Bridge Analytics and Developer Worlds with Power Platform” on Feb 12 and show how Power BI can integrate with Power Apps to let you change the data behind a report.

Power BI Incremental Refresh

Power BI incremental refresh (a Power BI Premium feature) refreshes a subset of a table with imported data. The main goal is to reduce the refresh time so that new data becomes available online faster. Patrick LeBlanc has a great video about how to make incremental refresh even more incremental by using the “Detect data changes” feature, and he explains in detail how it works.

What if you want to fully refresh a dataset configured for incremental refresh? For example, you configure a table to refresh incrementally throughout the day, but you want to fully process the dataset nightly, such as to pick up the latest changes to dimensions. Currently, the only option to fully refresh a dataset with an incremental refresh policy is to republish the dataset and refresh it (this works because the first refresh is always full). When the XMLA endpoint becomes writeable, you’ll be able to do so with an XMLA script. For example, the following script fully refreshes the InternetSales table without applying the refresh policy settings. Notice also the effectiveDate setting, which allows you to override the current date for testing purposes.

{
  "refresh": {
    "type": "full",
    "applyRefreshPolicy": false,
    "effectiveDate": "10/24/2019",
    "objects": [
      {
        "database": "AdventureWorks",
        "table": "InternetSales"
      }
    ]
  }
}

BI Axioms

A few months ago, I did an assessment for a large company that was advised by an undisclosed source that they should use their Dynamics Financials and Operations (F&O) system as a data warehouse. Recently, I came across a similar wish to use SAP as a data warehouse. I understand that people want to do more with less and shortcuts are tempting. But ERP systems can’t fulfill this purpose, and neither can other systems of record. True, these systems might have analytical features, but these features typically deliver only operational reporting. Operational reporting has a narrow view concerned with “now”, such as a report that shows customers with outstanding balances as of today. By contrast, BI is mostly concerned with historical and trend analysis.

In math, axioms are statements that are assumed to be correct without a proof. We need BI axioms and the list can start like this:

  • Every mid-size to large company shall have a centralized data repository for consolidating trusted data that is optimized for reporting. The necessity for such a repository is directly proportional to the number of data sources that must be integrated (that number will increase over time) and the complexity of the data transformations. The centralized data repository is commonly referred to as a data warehouse.
  • Dimensional modeling shall be the methodology to design the data warehouse schema. I know it’s tempting to declare your ODS as a data warehouse, but highly normalized schemas are not suitable for reporting.
  • If you’re after a single version of the truth, you shall have an organizational semantic layer.
  • ERP systems are not a replacement for a data warehouse. Neither are data lakes and Big Data.
  • You shall have both organizational and self-service BI, and they should complement each other. If you lean too much toward organizational BI, you’ll get a backlog of requirements. If you lean too much toward self-service BI, you’ll end up with fragmented “spreadmarts”, which is where you probably started.
  • Most of the BI effort should go toward organizational BI to integrate data, improve data quality, and centralize business calculations. Tools come and go but this effort shall endure.
  • Agile and managed self-service BI shall fill in the gaps. It should provide a feedback loop to extend organizational BI with data that the entire organization can benefit from.

Tips for Extended Events

Load testing and troubleshooting Analysis Services often requires capturing a query trace. The lightweight option is to create an Extended Events (xEvents) session. Let’s say you want to capture all query traffic for 24 hours. You might opt for SQL Server Profiler, but it’s implemented as a desktop app (there must be an active Windows session, and what happens if Profiler crashes or a corporate policy logs you out?), and it may impact the performance of your production server. The recommended way is to set up an xEvents session that logs the required events (the same events you see in SQL Server Profiler) to a *.xel file.

What’s not so obvious is how to analyze the file. The easiest way is to open the .xel file in SQL Server Management Studio (SSMS). You’ll see a new Extended Events menu added to the menu bar. Among other things, this menu allows you to export the trace to a SQL Server table!


What makes an xEvents session even more useful is that it allows you to correlate events. For example, for some obscure reason the Analysis Services PropertiesList info is available only in the QueryBegin event. Normally, capturing the QueryEnd event is good enough because it gives you all the info you need, such as the query duration, query statement, database name, and so on. But what if queries come from a Power BI report via a gateway to an on-prem SSAS instance? In this case, the user identity is in the PropertiesList info because the gateway uses a trusted account to connect to SSAS. So, now you must also capture QueryBegin if you want to know who sent the query. Fortunately, you can use the RequestID GUID column in the trace to correlate QueryBegin with QueryEnd, look up the username from the QueryBegin row, and add it to the QueryEnd event, as the sketch below shows.
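
If you script this correlation after exporting the trace, the logic is a simple two-pass join on RequestID. Here is a minimal TypeScript sketch; the row shape mirrors the trace columns mentioned above, and the EffectiveUserName element name inside PropertiesList is an assumption, so verify it against what your trace actually emits:

// Correlate QueryBegin/QueryEnd rows exported from the .xel trace.
interface TraceRow {
  EventClass: "QueryBegin" | "QueryEnd";
  RequestID: string;        // GUID shared by a begin/end pair
  PropertiesList?: string;  // populated only on QueryBegin
  TextData: string;         // the query statement
  Duration?: number;        // meaningful only on QueryEnd
}

// PropertiesList is an XML fragment; pull out the effective user.
// (The EffectiveUserName element name is an assumption -- check your trace.)
function extractEffectiveUser(props?: string): string | undefined {
  const match = props?.match(/<EffectiveUserName>([^<]*)<\/EffectiveUserName>/);
  return match?.[1];
}

function addUserToQueryEnd(rows: TraceRow[]): (TraceRow & { EffectiveUser?: string })[] {
  // Pass 1: remember each request's PropertiesList from its QueryBegin row.
  const propsByRequest = new Map<string, string>();
  for (const row of rows) {
    if (row.EventClass === "QueryBegin" && row.PropertiesList) {
      propsByRequest.set(row.RequestID, row.PropertiesList);
    }
  }
  // Pass 2: decorate each QueryEnd with the user looked up from its QueryBegin.
  return rows
    .filter(r => r.EventClass === "QueryEnd")
    .map(r => ({ ...r, EffectiveUser: extractEffectiveUser(propsByRequest.get(r.RequestID)) }));
}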

Atlanta MS BI and Power BI Group Meeting on December 2nd

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, December 2nd, at 6:30 PM at the Microsoft office in Alpharetta. Stacey Jones will present Power BI options with Python. Accelebrate will sponsor the meeting. And I will share some tips while demoing the latest Power BI Desktop features, such as the new ribbon, decomposition tree, and AI integration. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation: Integrating Power BI with Python
Date: December 2nd, 2019
Time: 6:30 – 8:30 PM ET
Place: Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview: Python is well suited for data science and big data professionals. It was voted the most popular programming language in 2019. Microsoft has made big investments in open-source R and Python, especially to extend Power BI. Join this session to learn how to integrate Python with Power BI in these ways:

  • Use Python as a data source
  • Transform data
  • Produce beautiful visualizations

Speaker: Stacey Jones specializes in mentoring and guiding firms in their efforts to build modern data, AI, and BI governance programs that empower their business with self-service BI and data science capabilities. He currently serves as the Principal Data Solutions Architect at the Atlanta Microsoft Technology Center (MTC).
Sponsor: Don’t settle for “one size fits all” training. Choose Accelebrate, and receive hands-on, engaging training precisely tailored to your goals and audience! Our on-site training offerings range from ASP.NET training and SharePoint training to courses on ColdFusion, Java™, C#, VB.NET, SQL Server, and more. Accelebrate.com
Prototypes with Pizza: “New ribbon, decomposition tree, and AI integration in Power BI Desktop” by Teo Lachev
