Presenting at SQL Saturday BI Atlanta

Come and join me on Saturday, January 9th, at the first SQL Saturday BI edition in Atlanta. You'll learn about the exciting new BI changes coming to SQL Server 2016 and the Microsoft on-premises roadmap!

The Best Self-Service BI Tools of 2015

I came across this interesting PC Magazine article that compares 10 popular self-service BI tools. And the winner is? Power BI, of course, rubbing shoulders with Tableau for the Editors' Choice award! The author, David Strom, did a great job reviewing the tools (this is not a trivial undertaking), but a few of the Power BI conclusions deserve clarification:

  • Cons: "Cloud version has a subset of features found in Windows version" – The cloud version is intentionally simple so that business users can start analyzing data without any modeling.
  • Sharing: "Microsoft relies on the shared Microsoft OneDrive at Microsoft cloud service (or what it calls a "content pack") to personalize and share your dashboard and reports via unique URLs" – Power BI doesn't rely on OneDrive for collaboration. Instead, it supports three ways to share content: simple dashboard sharing, workspaces, and content packs.
  • Custom visuals: "You can get quickly up to speed by searching through an online visualizations gallery to find the particular presentation template you want to use to show your data. This is the reverse of what many BI tools such as Tableau Desktop ($999.00) at Tableau Software and Domo ($2,000.00) at Domo have you do, and it takes a bit of getting used to." – I'm not sure what this refers to. There are built-in visualizations, and getting started with them is no different from other tools. But Power BI also supports custom visuals, which no other vendor offers.
  • Developer tools: "A new section called "Developer Tools" lets you build custom visualizations using a Visual Basic-like scripting language that is documented in a GitHub project. While it is still in beta, it could be a very powerful way to add your own custom look to your dashboards" – The Dev Tools feature for implementing custom visuals outside Visual Studio is in preview, but the actual visualization framework is not. And developers use TypeScript (a superset of JavaScript), not Visual Basic.

Speaking of reviews, here are some important Power BI characteristics that make it stand out from the rest of the pack:

  1. Data engine and DAX – no other tool comes close to the Power BI in-memory engine, which allows data analysts to build data models that are on a par with professional models.
  2. Hybrid architecture that allows you to connect your visualizations to on-premises data sources.
  3. Self-service ETL with Power Query – as far as I know, no other tool has such capabilities.
  4. Open architecture that allows developers to extend the Power BI capabilities.
  5. Great value proposition that follows the freemium model – Power BI Desktop is free, Power BI Mobile is free, Power BI service is mostly free.


Getting Rid Of Custom Visuals

Scenario: You might have imported a custom visual in Power BI Desktop, tested it, and decided not to use it. However, even if your reports no longer use the visual, Power BI will still prompt you to enable the custom visual with "This report contains a custom visual not provided by Microsoft…". This is a security warning meant to guard against malicious code, because custom visuals are deployed as JavaScript.

Currently, there is no way in Power BI to disable this prompt. To make things worse, neither the Power BI Service nor Power BI Desktop has a feature to remove a custom visual once it's added to a Power BI Desktop file.

Solution: Here are the manual steps to follow to get rid of a custom visual in a Power BI Desktop file for good:

  1. Copy the Power BI Desktop (*.pbix) file. Rename the file to have a zip extension, e.g. from Adventure Works.pbix to Adventure Works.pbix.zip.
  2. Unzip the file.
  3. In the folder where you unzipped the file content, navigate to the Report folder, and open the Layout file in your favorite text editor.
  4. At the top of the file content, find a resourcePackage string that includes the visual name (you could search for the name of the visual to locate it). For example, the resourcePackage element might look like this for the Sparkline visual:
    ,"resourcePackages":[{"resourcePackage":{"name":"Sparkline1444636326814","items":[{"path":"icon.png","type":3},{"path":"Sparkline.js","type":0}
  5. Carefully delete this entire string, making sure that you don't end up with two consecutive commas or a missing comma after the deletion.
  6. While you're in the uncompressed file content, also delete the folder that has the same name as the visual. Strictly speaking, this step is not needed to avoid the prompt, but it's a good idea to clean up all the visual's files so that you don't distribute its JavaScript source.
  7. Zip the entire content again. For some obscure reason, besides getting rid of the visual, in my case compressing the file reduced the PBI Desktop file size almost in half! This reduces the time to upload the file to the Power BI Service.
  8. Rename the file back to the original file name without the zip extension.

Now when you deploy the PBI Desktop file to Power BI and view its reports, you shouldn’t get prompted anymore.
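
If you need to do this often, the steps lend themselves to scripting. Here is a minimal Python sketch of the same procedure, under a few assumptions: the file and visual names are the examples from above, and the Layout file is assumed to be UTF-16 LE encoded JSON. Back up the original file before trying it:

import json, os, shutil, zipfile

PBIX = "Adventure Works.pbix"          # your PBI Desktop file
VISUAL = "Sparkline1444636326814"      # the visual's resourcePackage name; use yours
WORKDIR = "pbix_unpacked"

# Steps 1-2: a .pbix file is an ordinary zip archive, so unzip it
with zipfile.ZipFile(PBIX) as z:
    z.extractall(WORKDIR)

# Steps 3-5: drop the visual's entry from resourcePackages in Report/Layout
layout_path = os.path.join(WORKDIR, "Report", "Layout")
with open(layout_path, encoding="utf-16-le") as f:   # assumed encoding
    layout = json.load(f)
layout["resourcePackages"] = [
    p for p in layout.get("resourcePackages", [])
    if p["resourcePackage"]["name"] != VISUAL
]
with open(layout_path, "w", encoding="utf-16-le") as f:
    json.dump(layout, f, separators=(",", ":"))

# Step 6: delete the visual's folder so its JavaScript source isn't distributed
shutil.rmtree(os.path.join(WORKDIR, VISUAL), ignore_errors=True)

# Steps 7-8: zip everything back up into a .pbix
with zipfile.ZipFile("Adventure Works (clean).pbix", "w", zipfile.ZIP_DEFLATED) as z:
    for root, _, files in os.walk(WORKDIR):
        for name in files:
            full = os.path.join(root, name)
            z.write(full, os.path.relpath(full, WORKDIR))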

Implementing User Friendly Names in Tabular

Scenario: You want user-friendly field names in Tabular, Power Pivot, and Power BI Desktop, such as Claim Amount, as opposed to database column names, such as ClaimAmount or Claim_Amount. Multidimensional has a feature that automatically delimits words with spaces when it detects title case or underscores, but Tabular lacks this feature. While you can rename fields in Tabular one field at a time, each rename requires a commit action, so renaming all fields takes a long time.

Solution: While I’m not aware of a tool for renaming fields, the following approach should minimize the tedious work on your part:

  1. Wrap your table with a SQL view. It’s a good practice anyway.
  2. Alias the table columns. If you have a lot of columns, the easiest way to alias your columns is to use vertical copy and paste.
    1. In SSMS, script the table as SELECT TO (right-click the table, then Script Table as > SELECT To > New Query Editor Window). This generates the SELECT statement in a suitable format for the next steps (column names enclosed in square brackets, commas on the left).
    2. Hold the Alt key and make a vertical selection that encloses all column names, excluding the commas.
    3. Press Ctrl-C to copy.
    4. Hold the Alt key again. Click to the right of the first column name and drag the mouse cursor down until you reach the row with the last column. You should see a vertical line going down.
    5. Type ” AS ” without the quotes. The net effect is that SSMS enters AS for each column.
    6. Press Ctrl-V to paste the column names. The net result is that each column is aliased to itself, e.g. [ClaimAmount] AS [ClaimAmount].
  3. Now you can delimit the words with spaces. But if you have many columns, this can quickly get tedious too. Enter regular expressions.
  4. Hold the Alt key again for a vertical selection and select all the aliased column names, excluding "AS".
  5. Press Ctrl-H to bring up the SSMS Find & Replace dialog. In the Find field, enter the regular expression ~(\[)[A-Z]. This expression searches for any capital letter that is not immediately after the left square bracket [, so the first letter of each column name is left alone.
  6. In the Replace field, enter " \0" without the quotes. Notice that there is a space before the backslash. This replaces each matched capital letter with a space followed by the same capital letter.
  7. Check the "Match Case" and "Use Regular Expressions" options. Make sure that the "Look In" drop-down is set to Selection to avoid applying the replacement to all the text.
  8. Click Replace All. Now you have all words delimited.

The regular expression I use is not perfect. It doesn't discriminate between capital letters; for example, it will delimit consecutive capital letters, such as ID as I D, but it's faster to fix these exceptions than to do all the replacements manually. And if you come up with a better expression, please send it my way. The last step, of course, is to import the view, and not the table, in Tabular, Power Pivot, or PBI Desktop.
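
And if you prefer to generate the view outside SSMS, here is a minimal Python sketch of the same idea (the table, view, and column names are hypothetical). Its regex also keeps runs of capital letters together, so ID stays ID instead of becoming I D:

import re

def friendly_name(column: str) -> str:
    # Replace underscores with spaces: Claim_Status -> Claim Status
    name = column.replace("_", " ")
    # Space between a lowercase letter or digit and a capital: ClaimAmount -> Claim Amount
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Space between an acronym and the next word: USDAmount -> USD Amount
    name = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", name)
    return name

# Hypothetical columns; in practice you could read them from INFORMATION_SCHEMA.COLUMNS
columns = ["ClaimID", "ClaimAmount", "Claim_Status", "USDAmount"]
select_list = ",\n".join(f"    [{c}] AS [{friendly_name(c)}]" for c in columns)
print(f"CREATE VIEW dbo.vClaim AS\nSELECT\n{select_list}\nFROM dbo.Claim;")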

Microsoft Acquires Metanautix

If you've missed the announcement from a couple of weeks ago, Microsoft acquired Metanautix – a startup founded by ex-Google engineers who worked on BigQuery (aka Dremel). Technical details are scarce at this point. In fact, the Metanautix website doesn't exist anymore, but there are YouTube videos and slides, such as this one. A while back, I wrote about logical data warehouses, which come in different shapes and names, such as software-defined data marts, distributed data, and, what I call, brute-force queries, such as Amazon QuickSight. It looks like, with this acquisition, Microsoft is hoping to take a step in this direction, especially when it comes to Big Data analysis.

From what I was able to gather online to connect the pieces, Metanautix Quest uses a SQL-like language to define tables that point to wherever the data resides, such as in HDFS, flat files, or an RDBMS. The syntax to define a table might look like this:

DEFINE TABLE t AS /path/to/data/*

SELECT TOP(signal1, 100), COUNT(*) FROM t

I believe that the original Google implementation would leave the data on the Google File System (GFS). However, it looks like Metanautix always brings the data into an in-memory columnar store, similar to how Tabular stores data. When the user sends a query (the query could relate data from multiple stores), a multi-level serving tree algorithm is used to parallelize the query and fetch the data with distributed joins, as described in more detail in the "Dremel: Interactive Analysis of Web-Scale Datasets" whitepaper by Google. According to the whitepaper, this query execution pattern far outperforms MapReduce queries.
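
To illustrate the serving tree idea, here is a toy Python sketch of the pattern the Dremel whitepaper describes (a conceptual illustration, not Metanautix's actual implementation): leaf servers scan their partitions and return partial aggregates, and each tree level merges the partial results on the way up:

from functools import reduce

def leaf_query(partition):
    # Each leaf server computes a partial aggregate over its own partition,
    # e.g. a COUNT(*)
    return {"count": len(partition)}

def merge(a, b):
    # Intermediate servers combine partial aggregates from their children
    return {"count": a["count"] + b["count"]}

def serving_tree(partitions, fan_in=2):
    results = [leaf_query(p) for p in partitions]   # leaf level
    while len(results) > 1:                         # merge level by level
        results = [reduce(merge, results[i:i + fan_in])
                   for i in range(0, len(results), fan_in)]
    return results[0]

print(serving_tree([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]))  # {'count': 10}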

While I was reading about Metanautix, I couldn't help but ask myself "how is it different from Tabular if it brings the data in?" Yet, from the announcement:

“With Metanautix technology, IT teams can connect a diversity of their company’s information across private and public clouds, without having to go through the costly and complex process of moving data into a centralized system.”

It might be that Metanautix is more scalable when it comes to Big Data, although I don't see how this could happen if the data is not queried in situ. We shall see as details start coming in. "In the coming months, we will have more to share about how we will bring Metanautix technology into the Microsoft data platform, including SQL Server and the Cortana Analytics Suite." One thing is for sure: as with logical data warehouses, Metanautix won't solve your data integration challenges and it's not a replacement for a data warehouse. From what I can tell, it could help with ad hoc analysis across distributed datasets without having to build analytical models, with all the pros and cons surrounding that.