Posts

Atlanta MS BI and Power BI Group Meeting on February 1st

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, February 1st, at 6:30 PM.  Paul Turley (MVP) will show you how to use Power Query to shape and transform data. For more details, visit our group page.

Presentation: Preparing, shaping & transforming Power BI source data
Date: February 1st, 2021
Time: 6:30 – 8:30 PM ET
Place: Click here to join the meeting

Learn More | Meeting options

Overview: In a business intelligence solution, data must be shaped and transformed. Your source data is rarely, if ever, going to be in the right format for analytic reporting. It may need to be consolidated into related fact and dimension tables, summarized, grouped, or just cleaned up before tables can be imported into a data model for reporting.

  • Where should I shape and transform data: at the source, in Power Query, or in the BI data model?
  • Where and what is Power Query? Understand how to get the most from this amazing tool and how to use it most efficiently in your environment.
  • Understand Query Folding and how it affects the way you prepare, connect to, and interact with your data sources, whether using files, unstructured storage, native SQL, views, or stored procedures.
  • Learn to use parameters to manage connections and make your solution portable. Tune and organize queries for efficiency and to make them maintainable.

Speaker: Paul (Blog | LinkedIn | Twitter) is a Principal Consultant for 3Cloud Solutions (formerly Pragmatic Works), a Mentor, and a Microsoft Data Platform MVP. He consults, writes, speaks, teaches & blogs about business intelligence and reporting solutions. He works with companies around the world to model data, visualize and deliver critical information, and make informed business decisions using the Microsoft data platform and business analytics tools. He is a Director of the Oregon Data Community PASS chapter & user group, and the author or lead author of Professional SQL Server 2016 Reporting Services and 14 other titles from Wrox & Microsoft Press. He holds several certifications, including MCSE for the Data Platform and BI.
Prototypes without pizza: Power BI Latest

Calculation Group Gotchas

The moment you add a calculation group to your model, Power BI sets DiscourageImplicitMeasures = True on the model. Although the property name might trick you into believing that implicit measures are still supported, you can no longer create them, such as by dragging a numeric field onto the report to summarize that field. That’s because implicit measures are created as inline calculations, which calculation groups don’t support.

Also, there is currently an issue where adding a column from a calculation group to a filter turns on “Require single selection”, and the setting can’t be changed. Therefore, you won’t be able to filter on multiple calculation items, such as to present only MTD, QTD, and YTD from a list of many items in your calculation group. As a workaround, you can add a calculated column that flags the desired values and filter on it. You can vote to expedite the fix here.

Rogue Q&A Queries

I’ve noticed severe performance degradation after refreshing a Power BI Desktop model with some five million rows. The Power BI Desktop process showed a sustained 50-60% utilization for minutes in the Windows Task Manager. I ran a profiler trace and saw expensive DAX queries like these:

EVALUATE SELECTCOLUMNS(FILTER(VALUES('Sales'[PONumber]), LEN('Sales'[PONumber]) <= 100), "valueColumn", 'Sales'[PONumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES('Sales'[SalesOrderNumber]), LEN('Sales'[SalesOrderNumber]) <= 100), "valueColumn", 'Sales'[SalesOrderNumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES('Sales'[InvoiceNumber]), LEN('Sales'[InvoiceNumber]) <= 100), "valueColumn", 'Sales'[InvoiceNumber])

As it turned out, Power BI Desktop autogenerates these queries when building a Q&A index. The 100-character limit is there because Power BI wants to keep the index small. In addition, users are unlikely to ask about values that are longer than 100 characters. Why not check the maximum value length and skip the whole column? Because Power BI wants to skip only the values that are too long and still index the remaining values in the column.
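To make the filter in those queries concrete, here is a minimal Python sketch of the same logic: take the distinct values of a column and keep only those that are at most 100 characters long. The function name and the sample data are hypothetical. The sketch only mirrors what the traced DAX expresses, not how Power BI builds its index internally.

```python
MAX_INDEXED_LENGTH = 100  # the length limit that appears in the traced queries


def indexable_values(column_values, max_length=MAX_INDEXED_LENGTH):
    """Return the distinct values no longer than max_length, in first-seen order.

    Rough analogue of FILTER(VALUES(col), LEN(col) <= 100): long values are
    dropped individually, while the rest of the column is still kept.
    """
    seen = set()
    result = []
    for value in column_values:
        if value in seen:
            continue  # VALUES() returns each distinct value once
        seen.add(value)
        if len(value) <= max_length:
            result.append(value)
    return result


# Hypothetical sample column: one duplicate and one overly long value.
po_numbers = ["PO-1001", "PO-1002", "PO-1001", "X" * 150]
print(indexable_values(po_numbers))
```

Note that the whole column still has to be scanned to apply the length test, which is consistent with the cost observed on a high-cardinality column.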

To avoid this performance degradation when modeling on the desktop, you could disable the Q&A feature. This will also disable smart narratives because they depend on Q&A.

To do this, go to File, Options and Settings, Options, and turn off the Q&A option.

If Power BI Desktop is connected to a remote model, such as a published Power BI dataset, you’ll also see an option to create a local index. This option was added because Power BI needs to ask the user’s permission to query data from remote sources, build the data index, and store it on the user’s machine. By default, it’s disabled until the user explicitly turns on Q&A. For import models, the data is already on the user’s machine, so Power BI doesn’t need to ask for permission to query it. That’s why the option to build a local index is not applicable to models with imported data.

Disabling Q&A in Power BI Desktop affects the local file only. After you publish the model, you can reenable Q&A from the dataset settings if you want end users to use Q&A features. For remote models, if you leave the first option, “Turn on Q&A to ask …”, on, but disable the second option, “Create a local index….”, and publish the model to the service, then Q&A will be enabled in the service by default. That is, you don’t have to go to the dataset settings to enable Q&A for that model. For import models, you have to disable the first option, and then after publishing the model to the service, you have to go to the dataset settings to enable Q&A there.

Atlanta MS BI and Power BI Group Meeting on January 4th

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, January 4th, at 6:30 PM.  James Serra (Big Data/Data Warehouse Evangelist at Microsoft) will share best practices around staging data in an organizational data lake. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation: Data Lake Overview
Date: January 4th, 2021
Time: 6:30 – 8:30 PM ET
Place: Click here to join the meeting

Learn More | Meeting options

Overview: The data lake has become extremely popular, but there is still confusion on how it should be used. In this presentation I will cover common big data architectures that use the data lake, the characteristics and benefits of a data lake, and how it works in conjunction with a relational data warehouse. Then I’ll go into details on using Azure Data Lake Storage Gen2 as your data lake, and various typical use cases of the data lake. As a bonus I’ll talk about how to organize a data lake and discuss the various products that can be used in a modern data warehouse.
Speaker: James Serra is a big data and data warehousing solution architect at Microsoft. He is a thought leader in the use and application of Big Data and advanced analytics. Previously, James was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 35 years of IT experience. James is a popular blogger (JamesSerra.com) and speaker. He is the author of the book “Reporting with Microsoft SQL Server 2012”.
Prototypes without pizza: Power BI Latest

Atlanta MS BI and Power BI Group Meeting on December 7th

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, December 7th, at 6:30 PM. Patrick LeBlanc (Guy in a Cube) will share techniques to optimize your Power BI data models. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation: Optimizing the size of your model
Date: December 7th, 2020
Time: 6:30 – 8:30 PM ET
Place: Click here to join the meeting

Learn More | Meeting options

Overview: When working with your Power BI Data Model/Dataset, there are certain things that can be done to optimize the size of the model. There are also certain things that can wreak havoc on your Data Model. In this session we will walk you through several things that can be done to ensure that your data model is optimized for the best performance. We will discuss and demonstrate how items such as data types, model properties, and DAX calculations can adversely affect the size of the model. That’s just a small list of items; join the meeting to learn all the tips and tricks.
Speaker: Patrick LeBlanc is currently a Principal Program Manager at Microsoft and a contributing partner to Guy in a Cube. Along with his 15+ years’ experience in IT, he holds a Master of Science degree from Louisiana State University. He is the author and co-author of five SQL Server books. Prior to joining Microsoft, he received the Microsoft MVP award for his contributions to the community. Patrick is a regular speaker at many SQL Server conferences and community events.
Prototypes without pizza: “Power BI Latest” by Teo Lachev

Understanding Power BI Endpoints

In its early days, Power BI introduced an endpoint to support Analyze in Excel (AIXL). Later, the “Power BI datasets” connector relied on this endpoint to support connecting to published datasets. The AIXL endpoint was never intended to support other clients. It has a few limitations; for example, it doesn’t support long-running requests or write operations. It also doesn’t support importing data, as you’ve probably found when connecting to published datasets.

Later, Power BI Premium added the XMLA endpoint to support external clients connected to datasets in a premium capacity. If you’re on Premium, you should use the XMLA endpoint to connect external clients by using the Azure Analysis Services connector. Unfortunately, if you do so in PBI Desktop, “Get Data” won’t give you the nice UX that shows you which datasets are certified.

It would be nice if Microsoft added a mechanism in the future where the client libraries automatically take the new style of connection string (workspace + dataset name) and redirect to the AIXL endpoint if the dataset is hosted in a shared capacity, or to the XMLA endpoint if the dataset is in Premium. Or even better, support the XMLA endpoint for both shared and premium capacities.

In summary:

  • Analyze in Excel and Power BI Desktop use AIXL connections.
  • You should never use AIXL connection strings directly.
  • Business users should continue using the “Power BI datasets” connector.
  • Other external clients should use the XMLA endpoint to connect to a dataset in a premium capacity.
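As a rough illustration of the last point, the Python sketch below assembles a connection string for the XMLA endpoint. The `powerbi://api.powerbi.com/v1.0/myorg/<workspace>` address format is the one Microsoft documents for Premium workspaces. The helper name and the workspace and dataset names are made up for the example, and no connection is actually opened.

```python
XMLA_ROOT = "powerbi://api.powerbi.com/v1.0/myorg"  # documented address prefix


def xmla_connection_string(workspace, dataset):
    """Build a connection string for a dataset in a Premium workspace.

    Hypothetical helper: it only formats the string that a client tool
    or an Analysis Services client library would be given.
    """
    if not workspace or not dataset:
        raise ValueError("workspace and dataset names are required")
    return f"Data Source={XMLA_ROOT}/{workspace};Initial Catalog={dataset}"


# Made-up workspace and dataset names.
print(xmla_connection_string("Sales Analytics", "Adventure Works"))
```

The workspace plays the role of the server and the dataset the role of the database, which is why tools that already speak to Analysis Services can reuse the same connection pattern.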

Updated Export Settings

A Happy Thanksgiving to all of you! I’m thankful for your interest in and support of my work.

I’m glad that Microsoft has recently updated the Power BI tenant export settings and thus mitigated the complaints I had in my “A False Sense of Data Security” blog. Disabling the “Export to …” settings, which now also apply to paginated reports, no longer disables live connections to published datasets and the features that depend on them, such as connecting to published datasets in Power BI Desktop, accessing the Power BI Premium XMLA endpoint, and Analyze in Excel. Instead, there is a clear distinction, and there are now two separate settings that affect external connectivity (for XMLA connectivity, the XMLA Endpoint capacity setting must also be set to Read-Only or Read-Write):

  • Allow live connections – This is a catch-all setting for allowing the live connectivity features.
  • Allow XMLA endpoints and Analyze in Excel with on-premises datasets – Microsoft felt that there should be a separate setting (besides “Allow live connections”) for connecting to on-prem datasets. However, just like disabling “Allow live connections”, disabling this setting also effectively disables the Power BI Premium XMLA endpoint, preventing other tools, such as Visual Studio or SSMS, from connecting to published datasets.

Stop Using the Power BI Folder Connector

I know that it’s tempting to use the Folder connector to combine multiple files with the same schema and load them in Power BI Desktop with a few clicks – an idea promoted by Microsoft’s own Power BI Dashboard in a Day (DIAD) training.

But what happens if you want to automate the dataset refresh after uploading your model to Power BI Service? Suddenly, things don’t work so well anymore. The only option is to install a personal gateway on your laptop with all the issues surrounding it.

Instead, I recommend you upload files to Azure Data Lake Storage and use the Power BI Azure Data Lake Storage Gen2 connector. Setting this up will require some assistance from your friendly IT department to set up ADLS and grant you access, but they shouldn’t complain if they are serious about data governance.

As with the Folder connector, you can combine multiple files from an ADLS folder. Moreover, this deployment option offers several benefits over keeping files locally:

  1. Eliminates gateways.
  2. Allows files to be centralized and organized nicely in folders.
  3. Enables integration scenarios, such as a third-party system pushing files on a schedule.
  4. Provides secured access and supervision by IT.
  5. Supports CDM folders in case you ever want to go there, such as to load files from Power BI dataflows or Dynamics CRM.
  6. Is highly available, performant, and scalable.
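
Conceptually, combining files from a folder, whether local or in ADLS, means appending files that share a header into one table. The Python sketch below shows that idea with in-memory CSV text standing in for the files. The file names and columns are invented, and the sketch does not call ADLS or Power Query.

```python
import csv
import io


def combine_csv_files(files):
    """Append the rows of CSV files that share the same header.

    `files` maps a file name to its CSV text. Each output row also
    records the name of the file it came from. Raises ValueError if a
    file's header differs from the first file's header.
    """
    combined = []
    expected_header = None
    for name in sorted(files):
        reader = csv.DictReader(io.StringIO(files[name]))
        if expected_header is None:
            expected_header = reader.fieldnames
        elif reader.fieldnames != expected_header:
            raise ValueError(f"{name} does not match the expected schema")
        for row in reader:
            combined.append({"source_file": name, **row})
    return combined


# Invented monthly extracts with the same two columns.
files = {
    "sales_2020_01.csv": "OrderID,Amount\n1,100\n2,250\n",
    "sales_2020_02.csv": "OrderID,Amount\n3,75\n",
}
for row in combine_csv_files(files):
    print(row)
```

Failing fast on a mismatched header is a design choice for this sketch; it surfaces a stray file in the folder instead of silently producing misaligned columns.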

Atlanta MS BI and Power BI Group Meeting on October 5th

Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, October 5th, at 6:30 PM. Paco Gonzalez (CEO of SolidQ) will present and demonstrate the Power BI AI and ML capabilities. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation: AI, ML and Power BI
Date: October 5th, 2020
Time: 6:30 – 8:30 PM ET
Place: Join Microsoft Teams Meeting

Learn more about Teams | Meeting options

Overview: Perhaps some of the most powerful features of Power BI involve the Artificial Intelligence and Machine Learning functionality built into the platform. In this session, we will review the AI and ML capabilities built into Power BI Desktop and the Power BI Service. From Data Visualizations to Data Preparation and Data Modeling, we will discuss the various ways that AI and ML can be easily implemented in your reporting.
Speaker: Paco Gonzalez is the CEO of SolidQ North America, and a Microsoft Data Platform MVP. Focused on Business Analytics and Artificial Intelligence, he specializes in helping organizations become data driven from a strategic and technical perspective. Paco is a speaker at small and large conferences such as PASS Summit, Ignite, and Business Applications Summit, and he has published several books and whitepapers. He is based in Atlanta, GA.
Prototypes without pizza: Power BI latest features

Organizational Semantic Model

I am delivering a data governance assessment for an enterprise client. As a part of the effort to migrate reporting from MicroStrategy to Power BI, the client wants to improve data analytics. The gap analysis interviews with the business leaders revealed common pitfalls: no single version of truth, data is hard to come by, business users don’t know what data sources exist, business users spend more time in data wrangling than analytics, data quality is bad, IT is overwhelmed with report requests, report proliferation and duplication, and so on…

Sound familiar? As I’ve mentioned many times in my blog, an enterprise data warehouse (EDW) plays a critical role in overcoming the above challenges, but it’s not enough. A semantic model is also needed, and I extolled its virtues in my “Why Semantic Layer?” newsletter. In the Microsoft BI world, Analysis Services Tabular is commonly used to implement such models, which are typically layered on top of the EDW. In general, there are two ways to approach the model implementation:

  1. (Self-service BI path) Business users create self-service semantic models using Power BI Desktop. Behind the scenes, Power BI creates databases hosted in the Analysis Services Tabular server from the *.pbix files.
  2. (Organizational BI path) BI developers implement organizational semantic models.

Since both implementation paths lead to the same technology, it boils down to ownership, vision, and purpose.

Because IT is overwhelmed, the temptation is to transfer the semantic model development to business users. The issue with this approach is that business users seldom have the skills, time, and vision to do so. In the end, mini “semantic models” (“spreadmarts”) are produced and the same problems are perpetuated.

In most cases, my recommendation is for IT to own the semantic model because they own the data warehouse and the “Discipline at the Core” vision. And yes, unless operational and security requirements dictate otherwise, IT should strive for a single centralized semantic model that spans all subject areas. If the technology you use for semantic modeling can’t deliver acceptable performance with large models, then it’s time to change it.

Of course, not all data exists or will exist in EDW. This is where self-service “Flexibility at the Edge” comes in. I have high expectations for the forthcoming “Composite models over Power BI datasets and Azure Analysis Services” mega feature (public preview expected in November 2020). This will enable the following scenario that Power BI cannot deliver today:

  1. Business user starts by connecting live to corporate data in the organizational semantic model. Every new report requirement should start with evaluating if all or some of the data is in the semantic model, and if so, instructing the user to connect to the semantic model to avoid data modeling and data duplication.
  2. Business user wants to mash up this data with some data that is not in the EDW by retaining the live connection to the organizational semantic model and importing (or connecting live with DirectQuery to) other datasets.

This will give us the best of both worlds and help us achieve “Discipline at the Core, Flexibility at the Edge”.