MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, March 26th at 6:30 PM. Leo Furlong will introduce you to implementing organizational semantic models with Azure Analysis Services. BlueGranite will sponsor the meeting. And your humble correspondent will demo several of the latest Power BI features: persistent filters, tooltip pages, and sharing with anyone. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the group page) if you’re planning to attend.
Presentation:
Introduction to Azure Analysis Services
Date:
March 26, 2018
Time:
6:30 – 8:30 PM ET
Place:
South Terraces Building (Auditorium Room)
115 Perimeter Center Place
Atlanta, GA 30346
Overview:
Azure Analysis Services (AAS) is an exciting tool within the Azure Data Services tool set. In this session, we’ll learn about the basics of AAS including how to create the service in the Azure portal, options for developing solutions and deploying them, processing data, scaling up and out, security, pricing, and where AAS fits in an overall modern data architecture. At the end of this session, you’ll have a basic understanding of how to get started with the tool and know how it could fit in your organization.
Speaker:
Leo Furlong is an experienced Data and Analytics Principal at BlueGranite with extensive experience implementing Digital Transformation Strategy using on-premises and Azure cloud technologies.
Sponsor:
At BlueGranite, we help our clients utilize data as a strategic asset, delivering data architecture and data integration solutions that drive insights and analytics across their organizations. Our solutions monitor trends, measure performance, and provide real-time analytics for rapid, informed decision making. We leverage industry-leading technology from Microsoft to build modern data platforms that reduce operating costs, increase market share, improve workforce efficiency, and much more.
Atlanta MS BI and Power BI Group Meeting on March 26th | Prologika (Teo Lachev) | Posted 2018-03-21
I’m doing an assessment of a data mart and semantic layer for an organizational BI solution. I was given the source of the Analysis Services Tabular model. Upon opening it locally, it took Visual Studio several minutes to open the project. Even worse, after the project was finally loaded, my laptop CPU utilization went to 80-90% and all 16 GB of memory were exhausted rather quickly. With an empty workspace database! What’s going on?
As it turned out, a business requirement asked for all measures to be in the same table. So, developers introduced a Metrics DAX calculated table and assigned it as a home table for all measures, organized in display folders. Nothing wrong with that. However, the number of DAX measures was 2,774! It’s common practice to “flatten” DAX measures, such as having separate measures for time calculations: SalesAmount, SalesAmountQTD, SalesAmountYTD, and so on. This can easily result in a measure explosion, but I hadn’t experienced such a performance hit before with other clients.
As it turned out, each measure requires a data structure even with no data. When you select a table in Visual Studio, Visual Studio sends a DAX query asking for the value of each measure in the table: EVALUATE CALCULATETABLE(ROW("Measure1", [Measure1], "Measure2", [Measure2], …)). It does so because the Measure Grid shows not only the measure formulas but also their values. Currently, there is no way to suppress this query, not even by hiding the Measure Grid. When the server that hosts the workspace database receives the query, it allocates a data structure for each measure. When the query asks for thousands of measures, the server allocates a lot of CPU and memory resources just to evaluate it. In this case, the server exhausted all the memory on my laptop and returned an “Out of memory” exception to the client. Visual Studio doesn’t give up and then switches to evaluating each measure separately, sending a separate EVALUATE CALCULATETABLE query measure by measure! Once the project is finally loaded, design performance is bad, especially in the Diagram view. Every task, even selecting a table or moving the Visual Studio window from one monitor to another, results in a CPU utilization spike, and it takes a few seconds until the UI becomes responsive.
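Formatted for readability, the query Visual Studio sends has this general shape (the measure names here are hypothetical placeholders):

```dax
-- Visual Studio asks for the value of every measure in the selected table
-- so it can populate the Measure Grid. One column per measure; with 2,774
-- measures, this single query forces the server to allocate structures for all of them.
EVALUATE
CALCULATETABLE (
    ROW (
        "Measure1", [Measure1],
        "Measure2", [Measure2]
        -- ...and so on for every measure in the table
    )
)
```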
I asked Microsoft to introduce a switch to suppress the EVALUATE CALCULATETABLE query, such as when the Measure Grid is disabled. Personally, I don’t care about the measure values in the grid, or about the Measure Grid at all for that matter. These values are meaningless in most cases anyway, such as when they show blanks for time calculations because there is no date context at design time.
While Microsoft is investigating and working on improving the SSDT performance, you can mitigate the performance issue with many measures by doing the following:
If possible, assign measures to multiple home tables.
Switch your project to Manual Calculation Mode (Model menu, Calculation Options, Manual Calculation). Note that manual calculation won’t process calculated columns and relationships automatically. You’d have to switch back to automatic calculation or explicitly process the model with Process Default when you make design changes.
Create perspectives to reduce the number of tables in Diagram view.
With hundreds of vendors out there, choosing a BI tool isn’t easy. If you’ve been following me for a while, you know that I love Power BI. One of Power BI’s most prominent strengths is that it’s a part of the much broader Microsoft Data Platform that spans various on-premises and cloud offerings for data storage, processing, and analysis. Thanks to this integration, you can implement real-time dashboards, extend your Power BI reports with predictive capabilities, integrate with Big Data, and much more. In this newsletter, I’ll talk about an exciting integration option: extending Power BI with business applications, thanks to its integration with PowerApps and Flow. And if you happen to be in the Microsoft office in Alpharetta, feel free to ask for a demo of the Power BI Writeback solution that Prologika prepared and deployed to the Microsoft environment. You’ll be directed to one of the big screens where you can test the report that I mention in this newsletter.
What’s PowerApps?
Every organization requires some sort of custom data-driven apps. And every organization is under pressure to do more with less. Have you used Microsoft Access? If so, you probably remember that Access makes it easy to create data-driven forms. But they were available only in Access and could access only an Access database. You might also remember InfoPath – an app for designing, distributing, filling in, and submitting electronic forms containing structured data, which Microsoft discontinued in 2014. Think of PowerApps as the evolution and replacement of InfoPath. PowerApps is an Azure-hosted service that lets you build data-driven business apps that run in a browser or on mobile devices. Unlike Access and InfoPath, PowerApps supports many cloud and on-premises data sources. Connectivity to on-premises data sources requires installing a data gateway, which is the same software that Power BI uses to tunnel into your corporate network.
Targeting power users and developers, PowerApps requires minimal code and has its own expression language. In the process of building an app, you specify where you want your data to come from. Then, you choose predefined layouts for your forms and link them together. You can develop the app online or by downloading a Windows-based desktop app. Developing the app is free. Sharing the app requires a plan, but if you use Office 365, chances are that your O365 business plan already includes PowerApps.
What’s Flow?
As I explained in my “Going with the Flow” blog, Microsoft Flow is another cloud service for creating automation flows without writing code, similar to Zapier’s “zaps” and IFTTT’s “recipes”. Like PowerApps, Flow targets business users and developers willing to create basic multistep flows for automating tasks. For example, you might want to start a workflow when an email is received from specific recipients or containing specific words. Microsoft Flow makes it possible to automate this scenario and many, many more. Business users can connect their accounts, take advantage of pre-built templates, and create their own flows in an easy visual designer. And your PowerApps app can start a flow thanks to the fact that PowerApps and Flow work together.
How Can PowerApps and Flow Enrich BI?
After the integration with Visio, Power BI just took another leap forward with the introduction of the PowerApps custom visual. You can use this visual in your Power BI reports to bridge the Data Analytics and Developer worlds. Now your Power BI apps can integrate with Line of Business (LOB) applications in versatile ways, and the app can trigger workflows. All this with minimal code! Suppose you have a Power BI report that shows sales by customer. As users browse the data, they realize that some changes need to be made. Granted, a user can open the appropriate business app, make the changes there, and then go back to the report to see them. But read-only reports are so 20th century. Instead, why can’t we allow the user to make the changes on the report by integrating it with PowerApps? This scenario is commonly referred to as writeback.
This is exactly the approach my Customer Writeback app demonstrates. The user selects a customer in the table, and the customer details show up in the Change Customer visual to the right. This is the PowerApps visual that references my Customer Writeback PowerApps app. The user can make changes in the PowerApps form and save them back to the underlying database. Then the user can refresh the report to see these changes. Taking this one step further, the app can start a workflow, such as when formal approval is required for the changes.
If you have experience with PowerApps, implementing this scenario takes minutes. You can find the implementation steps and source code in my “Power BI Writeback” blog.
The Power BI integration with PowerApps and Flow opens exciting new possibilities and redefines the meaning of a report. These three technologies let you integrate your reports with “smart” applications that you can implement with almost no code! With more than 100 data sources, PowerApps and Flow can serve as an integration hub to mash up data from almost any place, create workflows, and then embed PowerApps forms in Power BI reports.
Teo Lachev Prologika, LLC | Making Sense of Data Microsoft Partner | Gold Data Analytics
Scenario: You use Azure Analysis Services and you want to trace events in the SQL Server Profiler. However, you run the Profiler, connect to AAS, and get greeted with “Unable to locate trace definition file Microsoft Analysis Services TraceDefinition 15.0.0.xml“. The prompt gives you an option to download the trace definition from the server, but nothing happens. You’ll encounter the same issue if you attempt to access a higher version of AAS or the Database Engine from a lower version of the Profiler, such as when you install SSMS 2012 on your laptop but connect to SQL Server 2016 or higher.
Workaround: While we’re waiting for Microsoft to provide the Azure AS trace definition, you can use this workaround:
Navigate to the Profiler trace definition folder. In my case this folder is C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Profiler\TraceDefinitions. Copy the latest trace definition file (in my case it was “Microsoft Analysis Services TraceDefinition 14.0.800.xml”) and paste it as Microsoft Analysis Services TraceDefinition 15.0.0.xml. You need admin privileges to do this.
Start Notepad or Notepad++ with admin rights and open Microsoft Analysis Services TraceDefinition 15.0.0.xml from that folder.
Change the Major version to 15, minor version to 0, and build number to 0 at the top of the file and save the file.
If you want the default trace templates so you don’t have to configure the trace from scratch, navigate to the Profiler’s Templates folder (C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Profiler\Templates\Microsoft Analysis Services). Copy the 140 folder and paste it as 150 folder. This folder has the Replay and Standard template definitions.
As Analysis Services evolves, the version number will increase. You can connect SSMS to AAS and check its version in the server properties. In this case, version 15.0.1.245 corresponds to 150, which is why the folder name is 150.
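For reference, after the edit the version header at the top of the copied trace definition file should look roughly like this (the element names come from the 14.x file you copied, so treat this fragment as illustrative rather than authoritative):

```xml
<TRACEDEFINITION>
  <VERSION>
    <!-- bumped from the copied 14.0.800 values to the version the Profiler asks for -->
    <MAJOR>15</MAJOR>
    <MINOR>0</MINOR>
    <BUILDNUMBER>0</BUILDNUMBER>
  </VERSION>
  <!-- event and column definitions follow unchanged -->
</TRACEDEFINITION>
```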
Azure SQL Database Managed Instance is available for public preview (for pricing, see this page). It will play a very important role among the SQL Server cloud PaaS derivatives, which now number four:
Azure SQL Database – A database-scoped service available in Basic, Standard, and Premium performance tiers.
Azure SQL Data Warehouse – A scalable farm of Azure SQL databases that has roots in Microsoft Analytics Platform System (APS).
Elastic Pools — A pool of Azure SQL databases that share the same resources; it’s particularly useful for multi-tenant scenarios.
(NEW) SQL Database Managed Instance – Similar to an on-premises SQL Server, SQL Database Managed Instance allows you to host multiple databases under the same instance.
A customer used SQL Database Managed Instance while it was in private preview, and I followed its evolution closely. Here are a few notes:
The Good
The good is in the name. It’s common to split your data into multiple databases. In the BI world, we typically have a staging database and a DW database, but you can also have MDS, SSISDB, and other databases. And it’s much easier and faster to reference objects in another database directly instead of exporting and staging the data. Now, instead of isolated cloud databases, SQL Database Managed Instance can host multiple databases, just like your on-premises SQL Server. This allows you to “lift” your on-premises databases and shift them to the cloud if this makes sense. Further, there is almost full feature parity between SQL Server and SQL Database Managed Instance as far as the Database Engine is concerned. Even better, SQL Agent is here so you can schedule jobs! Although not all job types, as you’ll find out in a moment, sigh…
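For example, with both databases on the same managed instance, an ETL step can read staging data directly with three-part names instead of exporting and re-importing it (the database, schema, and table names below are hypothetical):

```sql
-- Load new customers into the DW dimension straight from the staging
-- database hosted on the same managed instance (illustrative names)
INSERT INTO DW.dbo.DimCustomer (CustomerKey, FullName)
SELECT s.CustomerID, s.FullName
FROM Staging.dbo.Customer AS s
LEFT JOIN DW.dbo.DimCustomer AS d
    ON d.CustomerKey = s.CustomerID
WHERE d.CustomerKey IS NULL; -- insert only customers not already in the dimension
```

This kind of cross-database query is exactly what isolated Azure SQL databases made painful.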
The Bad
SQL Database Managed Instance is all about the Database Engine. It doesn’t support the BI offerings: SSAS, SSRS, SSIS, MDS. SSAS cloud migration requires deploying to Azure Analysis Services. SSRS needs to go to a virtual machine, although the Report Server database can be hosted in the managed instance. SSIS goes to Azure Data Factory (ADF) or a VM (the SSISDB catalog can be hosted in the managed instance). MDS needs to go to a VM (Enterprise Edition alert!). All these paths require additional deployments and additional pricing. So, if you’re on SQL Server Standard Edition on premises, don’t expect your bill to remain the same. Speaking of pricing in general, don’t expect any cloud migration to cost less. It might, or it might not. Do your homework. My experience shows that cloud deployments are about convenience and make sense when data is in the cloud, not so much about cost cutting. No vendor will let cloud offerings cannibalize its on-premises counterparts.
The Ugly
The SSIS story, and the whole Microsoft cloud ETL story for that matter, deserves the ugly award. I don’t know why Microsoft added ADF given that thousands of ETL developers use SSIS. To “lift and shift” your SSIS-based ETL, you need to configure ADF to host the SSIS runtime. This scenario is still in preview and doesn’t have a UI (you need to use PowerShell for the configuration part). I failed to get an answer from Microsoft on what engineering predicament precludes spinning up the ISServerExec process to host the SSIS runtime directly in the managed instance, so that we can lift and shift SSIS jobs and leave them running under SQL Agent. I wonder how many customers will continue deploying SSIS to a VM, bypassing ADF and its idiosyncrasies altogether.
Conclusion
Azure SQL Database Managed Instance is a great addition to the SQL Server cloud family. It will remove developer pain in the cloud by allowing you to host multiple databases under the same instance. However, “lift and shift” scenarios need to be carefully evaluated, especially for existing on-premises BI solutions.
Azure SQL Database Managed Instance – the Good, the Bad, the Ugly | Prologika (Teo Lachev) | Posted 2018-03-06
Scenario: An interesting scenario came today from a client. Consider the following schema. You have a fact table InternetSales and two dimension tables (Customer and SalesTerritory).
You want to produce a report ranking each customer based on their sales for each territory. In addition, you want the rank to change as you filter the report, such as filtering specific customers or territories, so you need a measure.
Solution: The DAX RANKX function takes a table as the first argument. However, in this case, the rank needs to be produced over fields from two tables. So, the CROSSJOIN function is used to produce the needed combination.
'InternetSales'[CustomerRank] := IF ( ISBLANK ( [SalesAmount] ) || NOT HASONEVALUE ( Customer[FullName] ), BLANK(), RANKX ( CROSSJOIN ( ALLSELECTED ( 'Customer'[FullName] ), ALLSELECTED ( 'SalesTerritory'[SalesTerritoryCountry] ) ), [SalesAmount],,, DENSE ) )
The ALLSELECTED function is needed to ignore the filter context coming from the FullName and SalesTerritoryCountry columns on the report. The condition in the IF statement avoids calculating the rank for customers with no sales and for the report total.
If you need to ignore a specific table so it doesn’t affect ranking then you can use the CALCULATE function (the IF logic excluded for brevity):
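A sketch of such a measure, assuming we want filters on the SalesTerritory table not to affect the ranking (adapt the table name to the one you want to ignore):

```dax
'InternetSales'[CustomerRank] :=
RANKX (
    CROSSJOIN (
        ALLSELECTED ( 'Customer'[FullName] ),
        ALLSELECTED ( 'SalesTerritory'[SalesTerritoryCountry] )
    ),
    -- ALL removes any SalesTerritory filter before computing the sales
    CALCULATE ( [SalesAmount], ALL ( 'SalesTerritory' ) ),
    ,
    ,
    DENSE
)
```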
Another way to solve this scenario is to produce the rank over the fact table, but you need to add FullName and SalesTerritoryCountry to the fact table first and use the fact columns on the report (not the ones from the dimensions). Then, you can do:
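A sketch of this variation, assuming FullName and SalesTerritoryCountry have been denormalized into the InternetSales fact table:

```dax
-- Rank over the two denormalized fact-table columns instead of the dimensions
'InternetSales'[CustomerRank] :=
RANKX (
    CROSSJOIN (
        ALLSELECTED ( 'InternetSales'[FullName] ),
        ALLSELECTED ( 'InternetSales'[SalesTerritoryCountry] )
    ),
    [SalesAmount],
    ,
    ,
    DENSE
)
```

The same IF guard shown earlier would still be needed to suppress the rank for blank sales and for the report total.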
Microsoft selected and published two Prologika Power BI-related consulting offers on Microsoft AppSource. And their price is right; they are both free!
Power BI for Your Organization: 2 HR Assessment – Prologika will evaluate your data analytics objectives to determine how Power BI can empower your organization to understand better and improve its business.
Power BI for Business Performance: 2-Hr Briefing – Prologika will provide you with a free, 2-hour briefing on how our client derived tremendous value from our solution for Business Performance Management. Learn how Prologika transformed the way a large organization does financial planning by creating and comparing budgets and forecasts with our customized solution based on the Microsoft Data Platform and Power BI.
The 2018 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms is out. I read the report and followed the dots. I personally hoped to see the Microsoft dot higher than the overrated Tableau, given everything I said here, here, and here. I’m glad that Gartner finally realized what I have been saying all along: that Tableau lacks “complex data model support”, among many other things.
Anyway, this is a great endorsement of how far Power BI has come, and of what Microsoft can do when product groups work together. Mind you, the main focus of Gartner’s evaluation was Power BI. The organizational BI pieces (SQL Server, SSRS, SSAS, SSIS, Azure) were not considered.
Here are my thoughts about the Microsoft-related cautions cited by Gartner:
Mode 2 analytics, Azure cloud only – “Power BI has mainly focused on the requirements for Mode 2 (agile, self-service) analytics, while its on-premises SQL Server Reporting Services serves the needs of Mode 1 (for scheduled, distributed reports). For Microsoft customers, this has resulted in a two-product deployment with different capabilities and different deployment approaches. Microsoft’s reference customers cited absent or weak functionality (14%) as a limitation to wider deployment, with 27% (the highest percentage of respondents for this Magic Quadrant) citing a range of other problems.” Gartner has a point here. “Cloud first” shouldn’t mean “on premises never”. Supporting Power BI reports in Power BI Report Server (an add-on to SSRS) was a great move, but customers expect to get at least the most important Power BI features, such as dashboards, Q&A, online report editing, Quick Insights, and the embedded APIs. They are not in Power BI Report Server, and there is no roadmap for when they will arrive on premises, if ever. If there is a fear that Power BI Report Server would cannibalize the cloud, then maybe we shouldn’t have Power BI Report Server at all. The interesting part is that some on-premises features, such as Excel workbooks with live connections, are not supported in the cloud. So, the feature mismatch goes both ways. I hope Microsoft addresses this as it’s a biggie for my clients.
Breadth of use – “As previously, Microsoft’s scores from its reference clients place it in the bottom quartile for breadth of use. Breadth of use looks at the percentage of users using the product for a range of BI styles: from viewing reports, creating personalized dashboards and doing simple ad hoc analysis, to performing complex queries, data preparation and using predictive models“. Not sure I agree with this one. My observations do show that business users don’t bother to learn the more advanced features, such as Power Query. But whose problem is this? Further, regarding “This pattern of use suggests that the more advanced data preparation is performed outside Power BI“: there are absolutely cases that go beyond a typical power user, and such tasks are better done by IT, such as data integration in a central data repository. Do Tableau and Qlik handle advanced data preparation? Wait, Tableau doesn’t even have data integration, while Qlik requires scripting.
Multiple products – “The core of Microsoft Power BI is a self-contained product. However, a number of elements in Microsoft’s roadmap that are part of the product vision span multiple products and require integration. For example, insight to action requires Microsoft Flow.” I don’t see a problem with having multiple products if they can integrate. Especially given that some are free, such as PowerApps and Flow with O365. The reason why they are separate products is that they don’t apply to BI only. And which BI vendor has flow, business apps, predictive models, in the same product? Not to mention that Tableau and Qlik don’t have anything more than just a reporting tool.
Sales experience – “The majority of Microsoft’s reference customers evaluated their sales experience as good to excellent. However, in relative terms, Microsoft is slightly below average for this Magic Quadrant.” I haven’t gotten feedback from clients about their sales experience, so I can’t comment, but I thought Gartner should have elaborated more on this beyond “relative terms”. Further, “While Microsoft Power BI is a low-priced option, clients expressed frustration about changes in the pricing and packaging and the lack of clarity about what Power BI functionality is included with Microsoft Dynamics 365.” In fact, Power BI pricing has remained the same and is clearly stated. I think what Gartner meant is that the Power BI Embedded pricing story has changed. I wonder how many customers complained about the Dynamics 365 integration.
2017 vs. 2018
2018 Gartner Magic Quadrant for Analytics and BI Platforms | Prologika (Teo Lachev) | Posted 2018-02-27
While we’re waiting for the Gartner 2018 Magic Quadrant for Data Analytics (should be out any moment), two others related to data analytics were released.
2018 Magic Quadrant for Data Management Solutions for Analytics
This one evaluates “complete software system that supports and manages data in one or more file management systems (usually databases)”. So, think of traditional and cloud-based data warehousing. I’m surprised to see AWS ahead of Microsoft and Oracle in the ability to execute, given that Amazon doesn’t have on-premises offerings (see the ML quadrant below where apparently that’s important).
2018 Magic Quadrant for Data Science and Machine-Learning Platforms
This one evaluates “vendors of data science and machine-learning platforms. These are software products that data scientists use to help them develop and deploy their own data science and machine-learning solutions.” I personally don’t agree with Gartner on the Microsoft Visionary rank, given all the investments Microsoft has made in machine learning.
“Microsoft remains a Visionary. Its position in this regard is attributable to low scores for market responsiveness and product viability, as Azure Machine Learning Studio’s cloud-only nature limits its usability for the many advanced analytic use cases that require an on-premises option.”
Apparently, Azure Machine Learning is not well received as it’s not easy to operationalize. That one I agree with. But “for on-premises workloads, Microsoft offers SQL Server with Machine Learning Services, which was released in September 2017 — after the cutoff date for consideration in this Magic Quadrant.” Not really. In 2016, Microsoft integrated R with SQL Server after the acquisition of Revolution R. Apparently, this wasn’t communicated well to Gartner.
I love performance optimization. It’s exciting to peek under the hood and take a deep dive in the mysterious world of how Microsoft designed things. And it always humbles me by bringing forth gaps in my knowledge.
Scenario: You have SSRS reports that query Analysis Services Multidimensional cubes. When users request reports in Report Manager (aka Web Portal starting with SSRS 2016), the report sporadically hangs for minutes without even getting to the ReportViewer page. The server doesn’t appear to be under resource pressure. The issue happens when the cube is being processed. And no, this is not about locking.
Solution: This one wasn’t trivial to figure out. The profiler showed that when the report is requested, SSRS fires the first MDX query (typically, a query that populates the parameter available values). The breakthrough came when I executed the same query in SSMS while the SSRS one was blocked, and it executed immediately. With some help from Microsoft, it turned out that flattened rowsets (which SSRS requests) are serialized by threads in the SSAS processing pool. Indeed, once I enabled the Serialize events in the profiler, I noticed that the Serialize Results Begin event fires but not Serialize Results End.
So, if there are not enough threads in the SSAS processing pool, the flattened queries are queued. You can use the MSAS11 Threads: Processing Pool Job Queue Length counter to confirm that processing threads are queued. In this case, the queue length was about 200 threads when fully processing the database!
The resolution depends on your specific situation. If you have plenty of CPU cores, try increasing the pool size by increasing the ThreadPool\Process\MaxThreads SSAS setting. The default value of this setting is 0, which causes SSAS to allocate a maximum of max(64, number of logical cores) threads. So, if you have a two-socket server with 16 hyperthreaded cores (32 logical cores), the maximum number of processing threads would be 64. On the other hand, if you see high CPU utilization during processing, try reducing the parallelism of the processing task by using one of these three approaches:
Reduce the maximum number of connections in the SSAS data source properties.
Set the MaxParallel setting on the processing batch command.
Reduce the CoordinatorExecutionMode SSAS configuration setting.
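If you go the MaxThreads route, the setting lives in msmdsrv.ini (it can also be changed from the server properties in SSMS with advanced properties shown). An illustrative fragment of the config file is below; the value 128 is an example, not a recommendation, and should be sized against your core count:

```xml
<ConfigurationSettings>
  <ThreadPool>
    <Process>
      <!-- 0 (the default) means max(64, number of logical cores) -->
      <MaxThreads>128</MaxThreads>
    </Process>
  </ThreadPool>
</ConfigurationSettings>
```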
Also, I recommend you load test your production server to ensure that it has enough capacity. I did a “Can Your BI Solution Scale” presentation at TechEd 2011 about load testing SSAS and SSRS that shows how this could be done.
Many thanks to Akshai Mirchandani from the SSAS product team for shedding light in dark places.
Unblocking MDX Queries from SSRS Reports | Prologika (Teo Lachev) | Posted 2018-02-23