Great Atlanta MS BI and Power BI Group Meeting Last Night

A good turnout last night for our Atlanta MS BI and Power BI Group July meeting. Some 60+ attended the meeting at the MTC office at Avalon. Kudos to Scott Fairbanks for the great presentation and CCG for sponsoring the meeting. Thanks to Microsoft for hosting us. We’ll make the slides available shortly at http://atlantabi.pass.org/Resources.aspx.

Power BI News from the Microsoft Business Applications Summit

If you haven’t done so, watch the “Power BI and the Future for Modern and Enterprise BI” presentation from the Business Applications Summit (July 22-24) to find where Microsoft is bringing Power BI in the next few months.

Power BI Service Scale and Adoption

  • 19M data models hosted
  • 58M monthly publish to web views
  • 8M+ report and dashboard queries/hour
  • Power BI a top 10 fastest growing skill according to Upwork

Microsoft will extend Power BI by investing in three main areas:

  • Unified platform for both self-service and enterprise BI
  • Intuitive experience that drive a data culture
  • Pervasive Artificial Intelligence for BI

072818_1802_PowerBINews1.png

Here are most important forthcoming near-future features across these three areas:

Unified platform for both self-service and enterprise BI

  • Make Power BI Desktop a better tool for larger models — new diagram view with perspectives, multi-select fields to set common properties (it’s about time to have this feature), display folders as in SSAS
  • Composite models (delivered in July 2018 release) – I covered this killer feature in my blog “Power BI Composite Models: The Good, The Bad, The Ugly
  • Aggregations – Ability to define aggregations as we can do in SSAS MD. The demo showed an impressive response time for summarized views on top of a 1 trillion row dataset.
  • Multi-geo deployments

Intuitive experience that drive data culture

  • Certified datasets
  • Application lifecycle managements (DEV, QA, PROD environments)
  • Dataflows – dataflows refer to what was announced as “datapools” and Common Data Model for Analytics (review my comments here).
  • Modernized Visualizations pane (delivered in July 2018 release)
  • Report wallpaper (delivered in July 2018 release)
  • Out of box themes
  • Visual headers (delivered in July 2018 release)
  • Snap visual with other visuals
  • Expression-based formatting for every property in the Visualizations pane – This is huge and my users would rejoice to have more control over report properties!
  • Excel-like pivot tables
  • Export to PDF – most voted feature will finally be delivered

Pervasive Artificial Intelligence for BI

  • Quick Insights on right-click
  • Natural questions with suggestions
  • Quick insights inside natural questions results
  • Integration with Python – Connector and custom visuals like R
  • Sentiment analysis
  • Integration with Microsoft Cognitive Services – The demo showed the model deducing that the images of air conditioners in a hotel are broken
  • Better integration with Azure Machine Learning Studio

An impressive list of features indeed!

Power BI Composite Models: The Good, The Bad, The Ugly

The July 2018 preview of Power BI Desktop delivers two killer preview features that solidify the Power BI position as the best data modeling tool for personal BI on the market. First, Microsoft relaxes the Power BI relationship limitations by letting you create M:M joins between two tables. Second, you can now create a composite (hybrid) data model with different storage modes, such as from an SQL Server database configured for DirectQuery and from an imported text file. Without reiterating the documentation, here are some important notes to keep in mind based on my pre-release testing.

The Good

M:M relationships remove a limitation that Power BI had since its Power Pivot days which is that one of the tables (the dimension or lookup table) must have a primary key: a field that uniquely identifies every row in the table. Now you can create a M:M relationship without requiring a bridge table, such as between two fact tables, as the screenshot shows. In this case, the Sales[State]-CityData[State] relationship is M:M because CA appears multiple times in each table. Previously, if you have attempted to create this relationship you would get an error message that the State field on the one side of the relationship is not unique.

070918_1104_PowerBIComp1.png

The new M:M feature paves the way for supporting more flexible schemas but bear in mind that other relationship limitations, such as redundant paths and close-loop relationship are still not allowed. It also doesn’t mean that you shouldn’t follow best data modeling practices, such as star schemas.

For example, Power BI would disallow an active M:M relationship between ResellerSales and InternetSales fact tables if they’re joined to the same (conformed) dimensions. That’s because as it stands Power BI still doesn’t have a user interface to prompt the end user which path to take so it deactivates the relationship when it discovers a conflict.

Perhaps, a more important enhancement is ability to create data models with mixed storage modes which are referred to as “composite” models. Just two weeks ago I taught a Power BI class and the client disappointed that they can’t join SAP HANA and SQL Server in DirectQuery mode within the same Power BI model. That’s because previously DirectQuery was limited to single database. Now you can!

070918_1104_PowerBIComp2.png

Composite models enable the following data acquisition options:

  • Combine multiple databases from the same server in DirectQuery.
  • Combine multiple data sources configured for DirectQuery
  • Implement a hybrid storage model that combines a data source with imported data with another data source configured for DirectQuery.
  • Combine a DirectQuery table with imported table from the same data source, such in the case where you prefer most tables to be imported but some in DirectQuery for real-time analysis.

Of course, DirectQuery is a double-edge sword. It lets you avoid importing data and eliminate latency, but performance depends on the data source so be careful. Aggregations (slated for later this year) could mitigate DirectQuery performance issues by letting you summarize data at a higher level.

The Bad

DirectQuery limitations continue to exist, including:

  • Imported tables can’t be converted to DirectQuery  and when attempting to do so, you’ll get an error “Imported tables can’t be converted to DirectQuery”. The reverse scenario, however, is supported. If the data source is DirectQuery, you can convert some tables to import data.
  • DirectQuery can’t return more than 1 million rows. This has been a long standing DirectQuery limitation (documented here). Consider a DimCustomer table (from same or different source) that joins FactSales configured for DirectQuery and you request a report that shows sales by customer. At a certain (undisclosed by Microsoft) point it becomes inefficient to send the entire customer list to the WHERE clause of the FactSales direct query. Instead, the query would group FactSales at the Customer field used for the join, and then locally aggregate the results. However, if that query exceeds 1 million, then this would result in the query failing.
  • Only a subset of data sources, such as popular relational databases and “fast” databases, support DirectQuery. I hope Microsoft extends DirectQuery to more data sources, such as Excel files and text files.
  • You can’t pass parameters to custom SQL SELECT statement or stored procedure, such as to pass the value that the user selects in a slicer to a stored procedure configured for DirectQuery.

Composite models add a new twist to relationships. When combining two data sources in DirectQuery (or mixed) storage mode, Power BI converts all joins between the two data sources to be of M:M cardinality, irrespective of the actual cardinalities involved. According to Microsoft, this design was due to implementation challenges and the need to preserve performance. Although this isn’t a big issue, semantically I’d prefer the join cardinality to reflect the data cardinality. For example, if I join a DimCustomer table from SQLServerA with FactResellerSales from SQLServerB, the join cardinality should be 1:M. However, it’s not possible to configure 1:M for heterogeneous joins. Consequently, it’s not possible to validate the data is indeed 1:M and the RELATED DAX function won’t work. Microsoft believes that the cardinalities of the two end points are unimportant, the key difference between this type of relationship and the existing relationships is that it only supports cross-filtering, but not other semantics like RELATED function and blank rows from RI violation.

The Ugly

Power BI has always supported three data acquisition options: Import, DirectQuery, and Live Connection (multidimensional data sources only). You might have noticed that when you connect Power BI Desktop to Analysis Services, you’d see that the Data tab and Relationship tab are not available. That’s because Power BI morphs into a presentation layer to AS and passes through DAX queries. Unfortunately, as I explained in my blog “Power BI Feature Discrepancies for Data Acquisition“, Power BI treats DirectQuery and Live Connection data sources as second-class citizens and disables certain features.

Back to composite models, regretfully they don’t apply to multidimensional models: Analysis Services (MD and Tabular), Power BI datasets, and SAP HANA. In other words, a single Power BI desktop file is still limited to connecting to a single multidimensional database. I hope that Microsoft lifts this limitation at some point because it’s not uncommon to create a self-service model that needs to combine data from an organizational semantic model in DirectQuery and some other data source (imported or DirectQuery).

Composite models and M:M relationships are important new enhancements that bring even more flexibility to the Power BI data architecture. M:M relationships allows you to support more flexible schemas. Although old and new limitations exist, composite models lets you implement models with mixed storage modes to enable interesting data integration scenarios.

Atlanta MS BI and Power BI Group Meeting on July 30

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, July 30th at 6:30 PM at the Microsoft office in Alpharetta (our regular venue room is being renovated). If you haven’t been at the new MS office, prepare to be impressed! Scott Fairbanks from CCG will show you how to create a tailored user experience with Power BI. Your humble correspondent will demonstrate Power BI killer features that will be announced at the Microsoft Business Applications Summit tomorrow. CCG will sponsor the meeting.  For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the main page) if you’re planning to attend.

Presentation:How to Create a Tailored End User Experience with Power BI
Level: Intermediate
Date:July 30th, 2018
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009

Overview:Power BI can deliver value in a variety of ways. With its ease of use, high levels of adoption, and growing market share, Power BI is being integrated into Analytics Solutions all over the world. In this session, attendees will see the power in leveraging Workspaces and Apps within Power BI to create a packaged, custom solution tailored to meet the needs of the consumer. With the end user experience in mind, this session will discuss the development process, design considerations, and the Power BI Service deployment of a collection of customized dashboards in a single App.

Discussion Points Include:

• Power BI Desktop

• Power BI Service

• Workspaces

• Apps

• Visualizations

• Interactions

• Formatting

• Mobile View

• Row level security

• Publishing

• Consuming

Speaker:Scott Fairbanks is a consultant that has been involved in Data and Analytics for 12+ years. Scott has always been interested in computing technology, and fell naturally into the role of helping others get value out of their analytics investment as a consultant with CCG. Over the course of his career, Scott has worked with many of the various BI product suites, including IBM’s Cognos, SAP’s Webi, and Tableau. In recent years, he has been swept up in the demands of the changing analytics environment and has been diving into the Microsoft Azure and Power BI space along with so many companies making the move to the cloud.
Sponsor:CCG is an award-winning consulting firm with a passion for helping clients solve their operational and strategic challenges through innovative Business Intelligence (BI) services. We believe business intelligence is more than a technical solution; it’s an opportunity to create lasting cultural change across an organization. Our experts focus on quickly gaining a big-picture understanding of your business to develop optimized, tailored BI solutions that empower you to drive greater decision-making at the speed of your data.
Prototypes with PizzaPower BI killer features in the July 2018 release.


092417_1708_AtlantaMSBI1.png

Implementing a Default Date Filter in Tabular and Power BI

Problem: A long standing limitation in Analysis Services Tabular and Power BI is the lack of default members, such as to default a Date dimension to the last date with data. Users find this annoying, especially for time calculations which require a date context and won’t produce results until you add a date-related field to the report one way or other. To aggravate things even more, Tabular switched to a JSON schema in SQL Server 2016. The new schema is great for expressing the metadata in clear and more compact way. However, because it’s not extensiible, it precludes community tools, such as BIDS Helper (now called BI Developer Extensions) and DAX Editor, to extend Tabular features and fill in the gaps left by Microsoft.

Workaround: Currently, there is no way to default a field in Tabular and Power BI. For the most common requirement to default a date filter, consider adding a field to the Date table that flags the desired item. You can implement this in different ways. For example, in Excel you can create a named set, but this requires MDX knowledge that end users won’t have. And it won’t work in other report clients, such as Power BI Desktop. Here is an approach that is simple to implement and works everywhere.

  1. Flag the desired item using whatever way you’re comfortable with: SQL view, Power Query, or DAX. For example, if users prefer the date filter to default to yesterday and the Date table is based on a SQL view, add the following column to the view:
    CASE WHEN [Date] <= DATEADD(DAY,-1, CAST(GETDATE() AS DATE)) THEN ‘Yes’ ELSE ‘No’ END YesterdayFlag
    Notice that the flag qualifies all rows prior to and including yesterday. In your first attempt, you might try only an equal condition. This will work when the YesterdayFlag field is added as a filter to an Excel pivot table. However, if you attempt to configure it as an Excel slicer, it will result in the following error:
    Calculation error in measure ‘<your measure name>’: Function ‘SAMEPERIODLASTYEAR’ only works with contiguous date selections.
  2. Reimport the Date table in Tabular or Power BI.
  3. Set up a report filter. For example, in Excel define a slicer using the YesterdayFlag field and default it to Yes, if you want to filter all reports on the same sheet.

    The net effect is that as time progresses, the YesterdayFlag field will filter the Date table and measures will show the yesterday’s data by default. Time calculations will also work as of yesterday. For example, YTD calculations will aggregate from the beginning of the year until yesterday. And end users don’t need to bother about resetting date filters if viewing data as yesterday is the prevailing preference.

Determining Power BI Relationships

Problem: Consider the following Power BI report that shows the sum of ResellerSales[SalesAmount] by SalesTerritory[SalesTerritoryCountry], Employee[FullName], and ResellerSales[SalesOrderNumber].

071618_2120_Determining1.png

This report is based on the following schema consisting of three tables.

071618_2120_Determining2.png

However, If there isn’t a direct relationship between ResellerSales and Employee, the moment you add an unsummarized field from the second table on the many side, such as Employee[FullName] after adding SalesTerritoryCountry and ResellerSales[SalesOrderNumber), you’ll get the error “Error: Can’t determine relationships between the fields”.

Solution: Interestingly, the report works fine if a summarized field, such as COUNT(Employee[EmployeeKey]) is used. In this case, the SalesTerritory dimension acts as a conformed dimension joined to two fact tables. The reason why it doesn’t work when Employee[FullName] is added is because there is no aggregation on the Employee table and the relationship between ResellerSales[SalesOrderNumber] and Employee[FullName] becomes Many:Many over SalesTerritory which is now a bridge table. One employee may be associated with multiple sales and a sale can be associated with multiple employees. How do we solve this horrible problem? There are at least two ways:

  1. If possible, create a direct relationship between ResellerSales and Employee. Now you have two dimensions joining a fact table and there is no ambiguity.
  2. If the only join option is to relate Employee to SalesTerritory, set the relationship’s cross filter direction to Both. This will indirectly filter the ResellerSales table and might achieve the same result. For each employee on the report, Power BI travels the relationship to find the related territories at the lowest granularity. In Adventure Works, the lowest granularity of the SalesTerritory table is the territory region, such as South East, so the cross filter set to Both will select all regions that are associated to the current employee. From there, it will get the sales orders and sales associated with these regions.

Where the report using the second option differs from the first is if there are multiple employees associated with the same region causing the sales order numbers to repeat although the report total is the same.

071618_2120_Determining3.png

Power BI supports flexible schemas but how you construct relationships may impact the report results. You should always start with a simple schema and grow in complexity if needed.

Download the sample

MVP For 15 Years!

Got awarded for Microsoft Most Valuable Professional (MVP) – Data Platform again. This will make my 15th consecutive year to be recognized by Microsoft for my expertise and contributions to the community! Learn more about the MVP program at https://mvp.microsoft.com/en-us/Overview.

MVP_Logo_Horizontal_Secondary_Blue288_CMYK_72ppi