Posts

Rogue Q&A Queries

I’ve noticed severe performance degradation after refreshing a Power BI Desktop model with some five million rows. The Power BI Desktop process showed a sustained 50-60 % utilization for minutes in the Windows Task Manager. I did a profiler trace and I saw expensive DAX queries like these:

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[PONumber]),LEN(‘Sales'[PONumber])<=100),”valueColumn”,’Sales'[PONumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[SalesOrderNumber]),LEN(‘Sales'[SalesOrderNumber])<=100),”valueColumn”,’Sales'[SalesOrderNumber])

EVALUATE SELECTCOLUMNS(FILTER(VALUES(‘Sales'[InvoiceNumber]),LEN(‘Sales'[InvoiceNumber])<=100),”valueColumn”,’Sales'[InvoiceNumber])

As it turned out, Power BI Desktop autogenerates these queries when building a Q&A index. The 100-size limit is because Power BI wants to keep the index small. In addition, values that are longer than 100 characters are unlikely to be asked by the user. Why not check thd the maximum column value and skip the column? Power BI wants to skip instances that are too long but still index the remaining instances of the column.

To avoid this performance degradation when modeling on the desktop you could disable the Q&A feature. This will also disable smart narratives because they depend on Q&A.

To do this, go to the File, Options and Settings, Options, and turn off the Q&A option.

If Power BI Desktop is connected to a remote model, such as a published Power BI dataset, you’ll see also an option to create a local index. This option was added because Power BI needs to ask user permission to query data from remote sources, build the data index, and store it on user’s machine. By default, it’s disabled until the user explicitly turns on Q&A. For import models, as the data is already on user’s machine, Power BI doesn’t need to ask the permission to query data anymore. That’s why the option to build a local index is not applicable to models with imported data.

Disabling the Q&A in Power BI Desktop affects the local file only. When you publish the model, you reenable Q&A from the data settings if you want end users to use Q&A features. For remote models, if you leave the first option, “Turn on Q&A to ask …”, on, but disable the second option, “Create a local index….”, and publish the model to the service, then Q&A will be enabled in the service by default. That is, you don’t have to go to dataset settings to enable Q&A for that model. For import models, you have to disable the first option, and then after publishing the model to the service, you have to go to dataset settings to enable Q&A there.

Propagating DAX BLANK() Over Relationships

Happy 4th of July with a DAX puzzle!

Consider the following three tables. Notice that InvoiceDetails has a data integrity issue where the third row doesn’t have InvoiceNo (InvoiceNo is blank).

What would a measure with COUNTROWS(Invoice) return for Teo? BLANK(), right? And that’s correct because Teo has no invoices. Now, what would COUNTROWS(InvoiceDetails) return for Teo? BLANK() again given that Teo has no line items? Nope, it returns 1. In other words, it will return the count of all line items whose InvoiceNo is blank. How come?

COUNTROWS(Invoice) returns blank for all customers with no matching rows in Invoice because these customers are mapped to the blank row in Invoice, which is excluded from COUNTROWS by default. But if you change COUNTROWS(Invoice) to COUNTROWS(VALUES(Invoice)), then 1 will be returned for these customers. Then, the filter flows to InvoiceDetals through the blank row and counts all rows with blank InvoiceNo.

This behavior wouldn’t have happened in SQL. However, DAX blank is not SQL null. Blank sales in DAX means no sales or zero sales, but null sales in SQL means unknown sales. Therefore, in DAX blank = blank, but in SQL null is always different than null.

Thanks to the Father of DAX, Jeffrey Wang, for shedding light in dark places.

Atlanta MS BI and Power BI Group Meeting on April 6th

MS BI fans, the time has come for a virtual meeting. Please join us online for the next Atlanta MS BI and Power BI Group meeting on Monday, April 6th, at 6:30 PM. I’ll show you how business analysts can apply AutoML in Power BI Premium to create predictive models. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Bringing Predictive Analytics to the Business User with Power BI AutoML (Virtual Meeting)
Date:April 6th, 2020
Time6:30 – 8:30 PM ET
Place:Join Microsoft Teams Meeting

Learn more about Teams | Meeting options

Computer audio is recommended

Conference bridge number 1 605 475 4300, Access Code: 208547

Overview:With the growing demand for predictive analytics, Automated Machine Learning (AutoML) aims to simplify this process and democratize Machine Learning so business users can create their own basic predictive models. Join this presentation to learn how to apply AutoML in Power BI Premium to predict the customer probability to purchase a product. I’ll show you the end-to-end AutoML process, including:

·       Create a dataflow

·       Choose a field to predict

·       Choose a model type

·       Select input variables (features)

·       Train the model

·       Apply the model to new data

·       Bonus: Integrate Power BI with AzureML

Speaker:Through his Atlanta-based company Prologika (https://prologika.com), a Microsoft Gold Partner in Data Analytics, Teo Lachev helps organizations make sense of their most valuable asset: their data. His strategy formulation, trusted advisory and mentoring, design and implementation services empower clients to apply effectively data analytics in order to understand, improve, and transform their business processes. Teo has authored and co-authored several books on organizational and self-service data analytics, and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Teo has been a Microsoft Most Valued Professional (MVP) Data Platform since 2004.
Prototypes without pizza:Power BI latest features

PowerBILogo

NullToZero in Power BI and Tabular

Null and zero typically have different semantics, where null indicates an unknown or missing value while zero is an explicit value. Sometimes, however, you want to show nulls as zeros. For example, an insurance company might want to show 0 claims instead of a blank value. By default, when a DAX filter doesn’t find any rows, the formula returns null (BLANK in DAX). For example, this Claims Count measure will return null when no claims match the report filters.

Claims Count = DISTINCTCOUNT(FactClaims[UniqueClaimNumber])

How do we show nulls as zeros?

The easiest and fastest way is to simply append zero.

Claims Count = DISTINCTCOUNT(FactClaims[UniqueClaimNumber]) + 0

Note that the storage engine is optimized to eliminate empty spaces so converting measures to zero can impact performance negatively. In addition, by default, reports remove dimension members with empty measures. This won’t happen if measures return zero.

Atlanta MS BI and Power BI Group Meeting on February 3rd

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on February 3rd, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Shabnan Watson will discuss how to apply aggregations to Power BI DirectQuery datasets to improve report performance.  Melissa will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Aggregations in Power BI
Date:February 3rd, 2020
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:Aggregations are one of the most important optimization methods for managing big datasets in Power BI. Combined with Direct Query storage mode, they allow big datasets to be analyzed efficiently by answering high level analytical queries quickly from memory while sending more detailed queries back to the source database. In this session, you will learn about the concept of aggregations, different table storage modes in Power BI, different kinds of aggregation tables, how to configure aggregation tables so that they can answer high level user queries, and finally how to use tools such as DAX Studio or Extended Events to determine if the aggregations are actually being used.
Speaker:Shabnam Watson is a Business Intelligence consultant with 18 years of experience developing data warehouse and BI solutions. Her work focus within the Microsoft BI Stack has been on Analysis Services and Power BI. She is an active member of PASS community and has spoken at PASS Summit, PASS SQL Saturdays, PASS Women In Technology Virtual Chapter, and other Local user groups. She is one of the organizers of SQL Saturday Atlanta and SQL Saturday Atlanta BI Edition. She holds a master’s degree in computer science, a bachelor’s degree in Computer Engineering, and a Certified Business Intelligence Professional (CBIP) certification by The Data Warehouse Institute (TDWI).
Sponsor:Bad data is bad business. Melissa helps organizations profile, cleanse and verify, dedupe and enrich all their people data (name, address, email and phone number) and more.  With clean, accurate and up-to-date customer information, organizations can monetize Big Data, improve sales and marketing, reduce costs and drive business insight. https://www.melissa.com/
Prototypes with PizzaTBD

PowerBILogo

Atlanta MS BI and Power BI Group Meeting on January 6th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on January 6th, Mondayat 6:30 PM at the Microsoft office in Alpharetta. I’ll introduce to Power BI Premium Automated Machine Learning (AutoML). Prologika will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Power BI Automated Machine Learning (AutoML)
Date:January 6th, 2020
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:With the growing demand for predictive analytics, Automated Machine Learning (AutoML) aims to simplify this process and democratize Machine Learning so business users can create their own basic predictive models. Join this presentation to learn how to apply AutoML in Power BI Premium to predict the customer probability to purchase a product. I’ll show you the end-to-end AutoML process, including:

·       Create a dataflow

·       Choose a field to predict

·       Choose a model type

·       Select input variables (features)

·       Train the model

·       Apply the model to new data

·       Bonus: Integrate Power BI with AzureML

Speaker:Through his Atlanta-based company Prologika (https://prologika.com), a Microsoft Gold Partner in Data Analytics, Teo Lachev helps organizations make sense of their most valuable asset: their data. His strategy formulation, trusted advisory and mentoring, design and implementation services empower clients to apply effectively data analytics in order to understand, improve, and transform their business processes. Teo has authored and co-authored several books on organizational and self-service data analytics, and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Teo has been a Microsoft Most Valued Professional (MVP) Data Platform since 2004.
Sponsor:Prologika is one of the most trusted names in Data Analytics. Our clients, from small businesses to Fortune 100 enterprises, derive tremendous value from our services. Our mission is to help organizations make sense of data by applying the latest technologies for descriptive and predictive analytics and get actionable insights. Your organization will spend less time mining for information and be better equipped to make sound business decisions. https://prologika.com

PowerBILogo

Atlanta MS BI and Power BI Group Meeting on December 2nd

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on November 4, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Stacey Jones will present Power BI options with Python. Accelebrate will sponsor the meeting. And I will share some tips demoing the latest Power BI Desktop features, such as the new ribbon, decomposition tree and AI integration. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Integrating Power BI with Python
Date:December 2nd, 2019
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:Python is well suited for Data Science and big data professionals. It has been voted as the most popular programming language in 2019. Microsoft made big investments in open-source R and Python, especially to extend Power BI. Join this session to learn how you can integrate Python with Power BI. Learn how to use Python in these ways:

·       Use Python as a data source

·       Transform data

·       Produce beautiful visualizations

Speaker:Stacey Jones specializes in mentoring and guiding firms in their efforts to build a modern Data, AI & BI governance programs that empower their business with Self-Service BI and Data Science capabilities. He currently serves as the Principal Data Solutions Architect at the Atlanta Microsoft Technology Center (MTC).
Sponsor:Don’t settle for “one size fits all” training. Choose Accelebrate, and receive hands-on, engaging training precisely tailored to your goals and audience! Our on-site training offerings range from ASP.NET training and SharePoint training to courses on ColdFusion, Java™, C#, VB.NET, SQL Server, and more. Accelebrate.com
Prototypes with Pizza“New ribbon, decomposition tree, and AI integration in Power BI Desktop” by Teo Lachev

PowerBILogo

Atlanta MS BI and Power BI Group Meeting on November 4th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on November 4, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Andy Lawrence will share best practices for impactful Power BI Dashboards. CCG Analytics will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Best Practices for Impactful Power BI Dashboards
Date:November 4, 2019
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:Power BI is gaining momentum as a preferred tool for dashboards and interactive reports. Let’s revisit some best practices for dashboard development, such as:

·       The importance of form and function

·       Facilitating user adoption

·       Mistakes that everyone makes

·       Fast shortcuts for clean reports

·       Hidden settings that are lifesavers

·       Best Power BI updates of 2019

·       Live demo of a fast dashboard build

·       Questions

Speaker:Andy Lawrence is a senior Power BI consultant at CCG Analytics and the leader of the Tampa Power BI user group. He’s a Florida native and a proud UF Gator (MBA) and USF Bull (MIS). At CCG he provides guidance on data modeling, tabular environments, azure administration, DAX writing, T-SQL and data visualization best practices. All of which he can expand upon if you have questions during his presentation.
Sponsor:CCG specializes in deploying solutions that not only provide value to the business but are adopted by users ensuring accountability of the IT driven system.  Moving beyond reporting, our Business Intelligence solutions support data governance, quality and standardization across the organization and enable stakeholders with tools like predictive analytics, user-defined alerts, data mining, what-if analysis and visually appealing dashboards.
Prototypes with Pizza“Lineage view” by Teo Lachev

PowerBILogo

Filtering Power BI Visuals by Measure

Power BI lets you filter the visual data by a measure, such as by a measure that dynamically calculates the customer’s rank. This is convenient but there are performance implications related to the way Power BI autogenerates the DAX query. If a visual has a visual-level filter on a measure, Power BI Desktop defines a variable table filter that includes all measures in the visual. In the following example, the visual filters on the IncludedInRank measure, but the filter table includes all measures in the visual:

VAR __ValueFilterDM1 =
FILTER (
KEEPFILTERS (
SUMMARIZECOLUMNS (
‘Customer'[PARENT_NAME],
‘Customer'[Label_DM],
‘Customer'[VP_NAME],
__DS0FilterTable,
__DS0FilterTable2,
__DS0FilterTable3,
__DS0FilterTable4,
“Vol_Gr_WKND”, ‘NRT_PRR_Parent'[Vol_Gr_WKND],
“Vol_Gr_WE_Mon”, ‘NRT_PRR_Parent'[Vol_Gr_WE-Mon],
“Vol_Gr_WE_Tue”, ‘NRT_PRR_Parent'[Vol_Gr_WE-Tue],
“Vol_Gr_WE_Wed”, ‘NRT_PRR_Parent'[Vol_Gr_WE-Wed],
“Vol_Gr_WE_Thu”, ‘NRT_PRR_Parent'[Vol_Gr_WE-Thu],
“Vol_Gr_WE_Fri”, ‘NRT_PRR_Parent'[Vol_Gr_WE-Fri],
“WTD_Gr_ptc”, ‘NRT_PRR_Parent'[WTD_Gr_ptc],
“IncludedInRank”, IGNORE ( ‘NRT_PRR_Parent'[IncludedInRank] )
)
),
[IncludedInRank] = 1
)

As a result, these measures are evaluated twice: once in the filter table and twice in the main query (EVALUATE clause). This could definitely impact performance if the measures are slow to begin with. While waiting on Microsoft to provide more control over the autogenerated query, one approach to mitigate the performance impact is to use the Power BI-provided TOPN filter. In this example, instead of sorting and filtering on IncludedInRank to find the top N customers, you would apply a TOPN filter on Customer[Parent Name] based on the IncludedInRank measure. When TOPN is used, the autogenerated filter table will filter only on the measure used for TOPN, thus avoiding evaluating the rest of the measures.

Atlanta MS BI and Power BI Group Meeting on October 7th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on October 7, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Qubole will show us how Presto, Azure Data Lake, and Power BI can be used to analyze Big Data. Qubole will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Leveraging Power BI on Presto for the Azure Data Lake
Date:October 7, 2019
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)
8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009
Overview:Presto is a distributed ANSI SQL engine designed for running interactive analytics queries. Presto outshines other data processing engines when used for business intelligence (BI) or data discovery because of its ability to join terabytes of unstructured and structured data in seconds, or cache queries intermittently for a rapid response upon later runs. Presto can also be used in place of other well-known interactive open-source query engine such as Impala, Hive or traditional SQL data warehouses. Attend this event to learn:

·        Why Presto is better suited for ad-hoc queries than other engines like Apache Spark

·        How to jumpstart analysts across your organization to harness the power of your big data

·        How to generate interactive or ad hoc queries or scheduled reports using Presto

·        Real-world examples of companies using Presto

Speaker:Man Zhang is a Solutions Architect at Qubole. He has 20+ years in software systems architecture, development, and integration and 4+ years in Big Data architecture. https://www.linkedin.com/in/man-zhang-34a887/
Sponsor:Qubole delivers a Self-Service Platform for Big Data Analytics built on Amazon Web Services, Microsoft and Google Clouds. We were started by the team that built and ran Facebook’s Data Service when they founded and authored Apache Hive.  With Qubole, a data scientist can now spin up hundreds of clusters on their public cloud of choice and begin creating ad hoc and/or batch queries in under five minutes and have the system autoscale to the optimal compute levels as needed.  Please feel free to test Qubole Data Services for yourself by clicking “Free Trial”​ on the website.
Prototypes with PizzaTBD

DCI_PowerBI