Applied MDS and DQS Training Class Available

In my opinion, every mid-size to large organization should have a solution for master data management. Based on my observations, data quality issues have reached a pandemic level. Ironically, putting a “lipstick on the pig” with some cool visualizations appears to have a higher priority on the BI roadmap but it’s just masking the underlying issues. The focus should be on centralizing data management and improving data quality. I know this is easier said than done and the problem is further complicated by the proliferation of systems and corporate acquisitions. I know an organization which spent $8M (kid you not) on a MDM project with one of the most prominent management consulting companies and all they got back after two years was slides and documentation. It shouldn’t be this way of course. But start somewhere…. Start small, perhaps with putting under control all of these Excel spreadsheets that are floating around in your company (aka referenced data).

And you don’t have to break the bank for software. Luckily, SQL Server has a great toolset for master data management and data quality. Speaking of which, I’m happy to announce my latest training class: Applied Master Data Services and Data Quality. This two-day class is designed to help you become proficient with SQL Server 2016 Master Data Services (MDS) and Data Quality Services (DQS). IT and business users will learn how to design MDS models and extend them with business rules, attribute groups, and hierarchies. IT will learn how to integrate MDS with upstream and downstream systems and how to enforce secured access. Business users will learn how to use Excel to manage data with the tool they love most – Excel! Then, the class teaches how to create DQS knowledge bases, perform matching and cleansing tasks, and integrate DQS with MDS and SSIS. The class can be customized to target separately pro and business user audiences.

Power BI Enterprise Content Sharing

I taught Power BI to a large organization this week. Naturally, they were interested in sharing content across multiple departments and even entire organization. Let’s revisit the sharing options available in Power BI which I summarize in the following table.

052316_0116_PowerBIEnte1.png

  • Simple sharing – This is the only option available in Power BI Free. It allows sharing real-only dashboards and underlying reports to a small audience, such as with your boss or with a few coworkers. The content author would typically invite recipients by email. Recipients can’t change the shared dashboards and reports that someone else shared with them.
  • Groups – Power BI groups (workspaces) are based on Office 365 unified groups. Anyone with a Power BI Pro subscription can create a group and add other users to the group. In the process of creating a group, the group owner specifies the group privacy level (private or public) and content access (edit or read-only). Groups are great to share content with members of a team, such as Sales, Finance, or a project. Public groups might seem like a convenient way to share content across the entire organization but they aren’t really “public” because a user still needs to add himself to the group. Once the group is created (give it some 30 minutes to become discoverable), the user can find the group in Outlook and add himself as a member of the group. So, public groups are public to the extent of allowing end users to gain membership without an administrator formally approving them. However, to share content across the entire organization, you probably won’t want to add every user to the group, especially considering that Power BI groups don’t currently support membership via AD groups or distribution lists.
  • Organizational content pack – Similar to service content packs, organizational content packs allow the author to package and distribute content to any user or group (distribution and AD groups are supported) who’s interested in the content pack. Organizational content packs are discoverable in the Power BI Content Pack Library (in powerbi.com, click Get Data ð Organization). Content packs are designed for broader content delivery. By default, the content pack is read-only but users can create personal write copies of packaged dashboards and reports. So content packs are flexible.

To recap, if you need to distribute content across multiple departments or entire organization, organizational content packs should be the way to go.

Atlanta MS BI Group Meeting on May 23th

MS BI fans, join me for the next Atlanta MS BI Group meeting on May 23th and 6:30 PM. I’ll present the self-service data shaping and transformation capabilities in Power BI. Then Alteryx, who is also sponsoring the event, will show us how to bring Power BI data transformation and analytics even further with their products. I’ll also showcase two interesting Power BI new features: narratives and row-level security.

Rate this meeting http://aka.ms/PUGSurvey, PUG ID: 104
Presentation: Self-service Data Preparation for Data Analysts
Level: Intermediate
Date: Monday, May 23th, 2016
Time 6:30 – 8:30 PM ET
Place: South Terraces Building (Auditorium Room)

115 Perimeter Center Place

Atlanta, GA 30346

Overview: Power BI has a lot to offer to data analysts interested in creating data models using Power BI Desktop or Excel. However, unless you analyze data stored in a data warehouse, data is rarely clean. Fortunately, Power BI Desktop and Excel include powerful data shaping and transformation capabilities that help the data analyst improve the data quality before importing the data in the model. In the first part of this session, Teo Lachev will show you how to use the Power BI built-in features to apply various data transformation tasks. In the second part, you’ll see how Alteryx can bring Power BI data transformation and analytics even further with:

  • Data Preparation: Quickly prep, blend, and analyze data from multiple sources for Power BI consumption. Deliver a repeatable workflow process to reduce the time spent on manual processes. Push datasets directly to Power BI to create rich visualizations.
  • Advanced Analytics: Utilize the power of spatial and predictive analytics in a simple drag-and-drop interface without having to write code. Add a new dimension of insight to your visualizations with advanced modeling tools to handle complex use cases such as Multi-Channel Analysis, Predictive A/B Testing, Location Selection, Market Basket Analysis, and more.
  • Text Analytics: Build repeatable analytic workflows and harness the power of Azure Machine Learning to easily perform sentiment analysis and extract key phrases from your data.
Speaker: Teo Lachev is a consultant, author, and mentor, with a focus on Microsoft Business Intelligence. Through his Atlanta-based company “Prologika”, a Microsoft Gold Partner in Data Analytics, he designs and implements innovative solutions that unlock the power of data and bring tremendous value to his customers, ranging from small companies to Fortune 50 organizations. Teo has authored and co-authored several SQL Server BI books and he has been leading the Atlanta Microsoft Business Intelligence group since he founded it in 2010. Microsoft has recognized Teo’s expertise and contributions to the technical community by awarding him the prestigious Microsoft Most Valuable Professional (MVP) award since 2004.
Sponsor: Alteryx provides business users with the unique ability to easily prep, blend and analyze all of their data using a repeatable workflow, then deploy and share analytics at scale for deeper insights in hours, not weeks. nalysts can connect to and cleanse data from data warehouses, cloud applications, spreadsheets and other sources, easily join this data together, then perform analytics – predictive, statistical and spatial – using the same intuitive user interface, without writing any code.
Prototypes with pizza Narratives and Row-level Security with Teo Lachev
Take a glimpse of how you can secure Power BI Desktop models with Row-level security

Presentation for SQL Saturday Atlanta

SQL Saturday Atlanta will take place on May 21st. Your humble correspondent will present “What’s New for BI in SQL Server 2016” scheduled to start at 8:15 AM. Join me for an hour of coverage of my favorite SQL Server 2016 new BI features. Get your questions answered.

SQL Server 2016 delivers many data analytics and information management new features and enhancements. Join this session to discover what’s new for BI in Database Engine, SSRS, SSIS, SSAS, and MDS. I’ll also explain the SQL Server 2016+ near-future roadmap that will help you implement new solutions and cut cost.

Power BI vs. Tableau (Part 2)

In a previous blog, I reviewed the claims that Tableau made in their “10 Ways Power BI Falls Short” presentation. To be fair to Power BI, in this blog I’ll list 10 areas where I believe Tableau falls short compared to Power BI.

Overall

In this section, I’ll review some general implementation and cost considerations that in my opinion make Power BI more compelling choice than Tableau.

  1. Data Platform – No matter how good it is, a self-service visualization tool addresses only a small subset of data analytics needs. By contrast, Power BI is a part of the Microsoft Data Platform that allows you to implement versatile solutions and use Power BI as a presentation layer. Want to implement a real-time dashboard from data streams? Azure Stream Analytics and IoT integrates Power BI. What to show reports on the desktop from natural questions? Cortana lets you do it by typing questions or voice. Want to implement smart reports with predicted results? Power BI can integrate with Azure Machine Learning? Want to publish SSRS and Excel reports? Power BI supports this. Expect this strength to increase as Cortana Analytics Suite and prepackaged solutions evolve.
  2. Cloud First – I know that many of you might disagree here as on-premises data analytics is currently more common, but I see the cloud nature of Power BI as an advantage because allows Microsoft to push out new features much faster than the Tableau 2-year major release cadence. Recall that Power BI Service is updated on a weekly basis while Power BI Desktop is on a monthly release cadence. And because Power BI is a cloud service, it supports the versatile integration scenarios I mentioned before.
  3. Cost –Tableau states that it’s “among the most affordable business analytics platforms” but it’s hard to compete with free. Power BI Desktop is free, Power BI Mobile apps are free, Power BI Service is mostly free. If you need the Power BI Pro features, Power BI is packaged with the Office 365 E5 plan, it has an enterprise license, and I’ve heard customers get further discounts from Microsoft.

Next, I’ll review specific Power BI strengths for different user types.

Business Users

By “business users”, I’ll mean information workers that don’t have the skills or desire to create data models.

  1. Content packs and Get Data – Basic data analytics needs can be met in Power BI without modeling. For example, if the user is interested in analyzing data from Salesforce, the user can use the Salesforce content pack and get predefined reports and dashboards. Further, the user can create their own reports from the dataset included in the content pack.
  2. Productivity features – Power BI has several features that resonate very well with business users. Q&A allows users to ask natural questions, such as “sales last year by country”. Power BI interprets the question and shows the most suitable visualization which the user can change if needed. Within 20 seconds, Quick Insights applies machine learning algorithms that help business users perform root cause analysis and to get insights that aren’t easily discernible by slicing and dicing, such as to find why profit is significantly lower in December. Such productivity features are missing in Tableau.

Data Analysts

Data analysts (power users) are the primary audience for self-service BI. Power BI excels in the following areas:

  1. Data shaping and transformations – Source data is rarely clean. Excel Power Query and Power BI Desktop queries allow the data analysts to perform a variety of basic and advanced data transformations. Tableau assumes that data is clean or relies on tools from other vendors.
  2. Sophisticated data models – Power BI offers much more advanced modeling experience where a data analyst can build a self-service model on a par with semantic models implemented by BI pros. For example, the model can have multiple fact tables and conformed dimensions. Power BI supports one-to-many and many-to-many relationships. For the most part, Tableau is limited to a single dataset with limited “data blending” capabilities if you want to join to another dataset.
  3. Powerful programming language – Tableau’s formula language is nowhere near the Data Analysis Expressions (DAX).

BI and IT Pros

Besides the ability to integrate Power BI to implement synergistic solutions, pros can build hybrid solutions:

  1. Hybrid solutions – Want to get the best of both worlds: always on the latest visuals while leaving data on premises? Power BI lets you connect to your data on premises.

Developers

Developers has much to gain from the Power BI open extensible architecture.

  1. Extensibility – Power BI allow developers implement custom visuals which can be optionally shared to Power BI Visuals Gallery. By contrast, Tableau visuals are proprietary and not extensible. In general, the Power BI API are richer than Tableau’s. Power BI let developers push data into datasets for real-time dashboards and manipulate deployed objects programatically. Power BI Embedded, currently in preview, allows developers to embed interactive reports without requiring installation of tools and with very attractive licensing model.

Summary

To recap, Tableau is a fine tool but it will be increasingly harder for it to compete against Power BI.

Expect one or more of the following to happen to Tableau and other vendors within a year:

  1. Reduce drastically licensing cost
  2. Get acquired by a mega vendor
  3. Lose a significant market share

I often hear that Power BI is great but it’s not mature. In my opinion, you should view Power BI to be as mature as Tableau and other tools. That’s because the Power BI building blocks have been around for many years, including xVelocity (the in-memory data engine where imported data is stored), Power Query, Power Pivot, Power View, Tabular, and Azure cloud infrastructure.

I’d love to hear your feedback on Tableau vs. Power BI.

Implementing Lookups in Power Query

Scenario: Suppose you have two source tables coming from different data sources, such as Excel or text files, as shown in in the screenshot below. The Master table has a DateStart and DateEnd columns that denotes a date range for ID. The Details table has some date (and possibly time) values. You need to look up the ID column from the Master table where the Details.Date column falls between the DateStart and DateEnd so that you can create a relationship Details.ID ð Master.ID. I probably make this more complex because of a real-life scenario where this requirement popped up. But the same approach can be applied when you need to implement simpler lookups in Power Query.

pqlookup

Solution: There are several options to implement this requirement:

  1. If tables come from a relational database, implement the lookup in the source query for the Details tale. This should be the preferred option for performance reasons.
  2. If the lookup requires only an equality condition on one or more columns, you can add a calculated column to the Details table and use the DAX LookupValue function. However, remember that xVelocity calculated columns are not compressed and they should be avoided for large fact tables.
  3. If the lookup requires inequality conditions, you can use add a calculated column that has a more convoluted DAX expression, such as the one below:

  ID = CALCULATE(MIN(Master[ID]), FILTER(Master, Master[DateStart]<=Details[Date] && Master[DateEnd] >= Details[Date]))
However, when you try to create a relationship, you’ll get a “circular dependency” error so you might need to copy the table.

  1. Implement the lookup in Power Query before tables are imported in the data model. Power Query doesn’t currently support a lookup transformation but the following function gets the job done:

    let

    GetKey = (SourceDate as date) =>

    let

    Source = Table.SelectRows(Master, each [DateStart]<=SourceDate and SourceDate<=[DateEnd]),

    Key = if Table.IsEmpty (Source) then null else Source[ID]{0}

    in

    Key

    in

    GetKey

In this case, the function takes a Date argument. When the function is invoked for each row in the Details table, the query will pass the value of the Date column to this argument. Then, Table.SelectRows filters the Master table using the inequality condition. If the filtered table is empty, the function returns null so that it doesn’t error out when attempting to get to the ID value. Otherwise, the function returns the first value in the ID column, assuming that there will be at most a single row matched. The attached sample demonstrates the solution modeled in Excel. The PQLookupData file simulates the source data, while the PQLookup file has the data model and queries.

Download Files

Rule-based User Names for Power BI Hybrid Solutions

We have a great new enhancement to the Power BI Gateway – Enterprise that just came out! Many organizations are planning hybrid Power BI solutions where report and dashboard definitions are deployed to Power BI but remain in SSAS Multidimensional or Tabular semantic models on premises. At the same time, many organizations have run into the dreaded Windows security issues when the user principle name (UPN) is in a different domain than the domain where SSAS is installed. Or the domains might not have a trusted relationship. In this case, the gateway can’t delegate EffectiveUserName to SSAS and the connection fails because it comes as an anonymous user. As a workaround, you can assign aliases in active directory but for many organizations AD changes are simply not an option.

To circumvent this issue, the latest release of the gateway now allows rules. Suppose that I log in to Power BI with user1@prologika.com but the SSAS server is installed on the acme domain. Now, the gateway administrator can set up a rule on the gateway properties that replaces “prologika” with “acme” (make sure to click the Add button to add the rule). Then, you can test the rule with some user accounts. As you can see from the screenshot and from the announcement, Microsoft also plans to support CustomData in near future. This might be useful in the scenarios where you don’t care about the actual user UPN when applying data security in SSAS. For example, you can say that all users belonging to the acme1 and acme2 domains are “Acme Users”. Then, you can use the CustomData() function (supported in both Multidimensional and Tabular) to return the mapped string and apply whatever security rules are needed for Acme Users.

With this release, Power BI Gateway – Enterprise allows all organizations to maximize their investment in SSAS irrespective and implement hybrid Power BI solutions irrespective of their AD domain setup.

050616_2207_RulebasedUP1.png

Power BI vs. Tableau (Part 1)

As I mentioned in my blog “Why Business Like Yours Choose Power BI Over SiSense”, expect attack from other vendors to intensify as they find themselves fighting an increasingly uphill battle against Power BI. As we’ve seen, Power BI has indeed disrupted their sales cycles. In this blog, I’m reviewing the “10 Ways Power BI Falls Short” presentation that Tableau has published on their site. Tableau, of course, is a fine tool for what it’s designed to do – mainly self-service BI. But as we’ve all seen tools come and go as seasons. A few years ago it was all about Qlik, then Tableau, and now 2016 seems to be the year of Power BI. My advice has always been that the focus should be on sound strategy, data integration, and data quality, and not tools. The last thing you want is a “cool” tool that over promises but under delivers and you’re left with nothing when you decide to move on (and some of the vendors really cross the line during their sales pitch). So, we have to keep them honest!

Anyway, let’s take a look at some the claims that Tableau has made by going through their slides. This is, of course, a one-sided perspective that extolls the Tableau virtues. I plan a future post “10 Ways Tableau Falls Short” to fill in the gap. Also, Tableau should get in the habit to update this document frequently given the fact that Power BI changes weekly and some of the claims are no longer valid.

  1. Missing outliers are lost insights – Tableau has a point here. For some obscure reason, Power BI developers has decided to apply a data reduction algorithm to favor performance over details. True, this might result in lost outliers with thousands of data points. I recommend Microsoft allow end users to adjust the data reduction settings. I know this limitation is high on the wish list and I expect it to be addressed/lifted really soon. UPDATE 9/13/2017 – As of this date, all Power BI visuals except map supports rendering of all data points.
  2. Difficult to answer easy questions – This refers to the fact that currently Power BI doesn’t support auto-generating common calculations. Fair enough, Power BI doesn’t support this yet but the statement “you’ll need to learn DAX first” is somewhat overloaded. There are plenty of DAX examples online of how to implement common calculations so there isn’t that much to learn. And DAX is much more powerful than Tableau expressions. UPDATE 10/1/2016 – Power BI introduced Quick Calc with the Percentage of Total being the only one currently available. UPDATE 4/3/2017 – Better yet, Power BI introduced Quick Measures, which supports various prepackaged calculations and show the DAX formula.
  3. No trends or forecasting available – Power BI just added trendlines. For now, forecasting needs to be done either in Excel or R. I don’t know why Power BI still hasn’t picked the linear forecasting capabilities that Power BI for Office 365 had. Another feature that is very high on the wish list so I don’t expect you have to wait long for forecasting. UPDATE 10/1/2016 – Power BI added forecasting.
  4. You can’t compare several categories – If I understand this correctly, it refers to ability to drill down across multiple categories. Power BI matrix reports should get the job done. Also, Power BI has recently added the ability to drill through chart data points.
    050416_0229_PowerBIvsTa1.png
  5. Filtering is tough – Tableau is correct that Power BI doesn’t support context filtering but the statement “You’d have to take the time to filter everything around it, one-by-one, instead” is overloaded. It shouldn’t be that difficult to filter out values using visual-level, page, or report filters. It might take a few more clicks, but I won’t consider this to be a major limitation.
  6. Half the details = Half the insight. True, Power BI tooltips are not yet customizable. Should important information be in tooltips though and require hovering from one point to next? UPDATE 10/1/2016 – Tooltips are now customizable.
  7. Organizing your data is difficult – True, Power BI doesn’t support dynamic groups, e.g. by lassoing some scatter points. I personally haven’t heard users complaining or asking about it so I don’t consider it to be a major limitation. Power BI does support hierarchies. UPDATE 11/1/2016 Power BI Desktop October Release introduced dynamic groups and binning in charts.
  8. No offline iterations allowed – “In Power BI, you can do some basic web editing, but you can’t download it to your desktop or work offline” This is incorrect. First, web report editing it’s on a par with desktop editing. Second, a best practice is to create your reports in Power BI Desktop and upload to powerbi.com. If you do this, you can download the pbix file and work offline. Moreover, Tableau web editing has more limitations than Power BI.
  9. You can’t tell a story – Outdated. Power BI added a Narratives for Power BI. Coupled with Quick Insights, these features surpass the Tableau capabilities.
    050416_0229_PowerBIvsTa2.png
  10. You can’t ask what-if questions – True, Power BI doesn’t support What-If natively yet. If this is important, you can export the visual data to Excel and use the Excel what-if, goal seek, and scenario capabilities. UPDATE 9/13/2017 – August release introduced What-If parameters and analysis

Overall, I believe that some of the points Tableau makes are insignificant while Power BI has already addressed others. Agree? Stay tuned for a “10 Ways Tableau Falls Short” blog.

SQL Server 2016 To Become GA on June 1st

Microsoft announced that SQL Server 2016 will become generally available on June 1st. The BI edition is gone and I don’t think that many people would miss it because of its client-based CAL licensing. I’d like to thank Microsoft of listening to feedback from MVPs and community and for including Tabular in Standard Edition! For more details, take a look at the SQL Server 2016 Editions document.

050216_1703_SQLServer201.png

 

Drilling Through in Power BI

Excel users are familiar with drilling through details by double-clicking a cell in a PivotTable report or PivotChart data point. Power BI has silently added a similar feature to let users see the level of detail behind a chart data point in both Power BI Desktop and Power BI Service. This option can be enabled per visual by turning on the See Records option in Power BI Desktop or the same option from the Explore menu when viewing a report in powerbi.com. Interestingly, it’s also available when you right-click a data point in Power BI Desktop (this is the first time I’ve seen a context menu to work in Power BI).

050216_0146_DrillingThr1.png

Once you enable it, clicking on a chart data point doesn’t trigger interactive highlighting (the default behavior). Instead, it navigates you to a new page that auto-generates a Table report whose filters inherit the filters on the main report. Interestingly, drilling through works even the main report has a multi-valued filter (it took Excel all the way to 2016 to support multi-valued filters for drilling through).

050216_0146_DrillingThr2.png

I think Power BI defaults to 1,000 rows by default and I don’t see a way to change this limit. It the drillthrough action results in more rows, a warning icon is displayed in the upper left corner of the report. It’s not clear how Power BI preselects which columns to show in the drillthrough report but you can always use the Fields list to add/remove fields. That’s another great feature that Excel drillthrough didn’t support and required custom SSAS actions to support. Also, instead of auto-generated field captions the report shows the actual field names (another welcome feature that Excel doesn’t support). Unfortunately, currently Power BI doesn’t support drilling through cells in Tabular and Matrix reports.

Power BI “See Records” feature adds drilling-through capabilities to chart and map visuals to allow users to see the rows that contribute to the aggregated value. Users would love this feature as it’s more versatile than Excel drill-through and completely missing in Power View reports.