Power BI – Page 16

Posts

Atlanta MS BI and Power BI Group Meeting on July 30

July 22, 2018/0 Comments/in Blog, Events/by Prologika - Teo Lachev

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, July 30th at 6:30 PM at the Microsoft office in Alpharetta (our regular venue room is being renovated). If you haven’t been at the new MS office, prepare to be impressed! Scott Fairbanks from CCG will show you how to create a tailored user experience with Power BI. Your humble correspondent will demonstrate Power BI killer features that will be announced at the Microsoft Business Applications Summit tomorrow. CCG will sponsor the meeting. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the main page) if you’re planning to attend.

Presentation:	How to Create a Tailored End User Experience with Power BI
	Level: Intermediate
Date:	July 30th, 2018
Time	6:30 – 8:30 PM ET
Place:	Microsoft Office (Alpharetta) 8000 Avalon Boulevard Suite 900 Alpharetta, GA 30009
Overview:	Power BI can deliver value in a variety of ways. With its ease of use, high levels of adoption, and growing market share, Power BI is being integrated into Analytics Solutions all over the world. In this session, attendees will see the power in leveraging Workspaces and Apps within Power BI to create a packaged, custom solution tailored to meet the needs of the consumer. With the end user experience in mind, this session will discuss the development process, design considerations, and the Power BI Service deployment of a collection of customized dashboards in a single App. Discussion Points Include: • Power BI Desktop • Power BI Service • Workspaces • Apps • Visualizations • Interactions • Formatting • Mobile View • Row level security • Publishing • Consuming
Speaker:	Scott Fairbanks is a consultant that has been involved in Data and Analytics for 12+ years. Scott has always been interested in computing technology, and fell naturally into the role of helping others get value out of their analytics investment as a consultant with CCG. Over the course of his career, Scott has worked with many of the various BI product suites, including IBM’s Cognos, SAP’s Webi, and Tableau. In recent years, he has been swept up in the demands of the changing analytics environment and has been diving into the Microsoft Azure and Power BI space along with so many companies making the move to the cloud.
Sponsor:	CCG is an award-winning consulting firm with a passion for helping clients solve their operational and strategic challenges through innovative Business Intelligence (BI) services. We believe business intelligence is more than a technical solution; it’s an opportunity to create lasting cultural change across an organization. Our experts focus on quickly gaining a big-picture understanding of your business to develop optimized, tailored BI solutions that empower you to drive greater decision-making at the speed of your data.
Prototypes with Pizza	Power BI killer features in the July 2018 release.

Implementing a Default Date Filter in Tabular and Power BI

July 18, 2018/0 Comments/in Blog/by Prologika - Teo Lachev

Problem: A long standing limitation in Analysis Services Tabular and Power BI is the lack of default members, such as to default a Date dimension to the last date with data. Users find this annoying, especially for time calculations which require a date context and won’t produce results until you add a date-related field to the report one way or other. To aggravate things even more, Tabular switched to a JSON schema in SQL Server 2016. The new schema is great for expressing the metadata in clear and more compact way. However, because it’s not extensiible, it precludes community tools, such as BIDS Helper (now called BI Developer Extensions) and DAX Editor, to extend Tabular features and fill in the gaps left by Microsoft.

Workaround: Currently, there is no way to default a field in Tabular and Power BI. For the most common requirement to default a date filter, consider adding a field to the Date table that flags the desired item. You can implement this in different ways. For example, in Excel you can create a named set, but this requires MDX knowledge that end users won’t have. And it won’t work in other report clients, such as Power BI Desktop. Here is an approach that is simple to implement and works everywhere.

Flag the desired item using whatever way you’re comfortable with: SQL view, Power Query, or DAX. For example, if users prefer the date filter to default to yesterday and the Date table is based on a SQL view, add the following column to the view:
CASE WHEN [Date] <= DATEADD(DAY,-1, CAST(GETDATE() AS DATE)) THEN ‘Yes’ ELSE ‘No’ END YesterdayFlag
Notice that the flag qualifies all rows prior to and including yesterday. In your first attempt, you might try only an equal condition. This will work when the YesterdayFlag field is added as a filter to an Excel pivot table. However, if you attempt to configure it as an Excel slicer, it will result in the following error:
Calculation error in measure ‘<your measure name>’: Function ‘SAMEPERIODLASTYEAR’ only works with contiguous date selections.
Reimport the Date table in Tabular or Power BI.
Set up a report filter. For example, in Excel define a slicer using the YesterdayFlag field and default it to Yes, if you want to filter all reports on the same sheet.

The net effect is that as time progresses, the YesterdayFlag field will filter the Date table and measures will show the yesterday’s data by default. Time calculations will also work as of yesterday. For example, YTD calculations will aggregate from the beginning of the year until yesterday. And end users don’t need to bother about resetting date filters if viewing data as yesterday is the prevailing preference.

Determining Power BI Relationships

July 16, 2018/2 Comments/in Blog/by Prologika - Teo Lachev

Problem: Consider the following Power BI report that shows the sum of ResellerSales[SalesAmount] by SalesTerritory[SalesTerritoryCountry], Employee[FullName], and ResellerSales[SalesOrderNumber].

This report is based on the following schema consisting of three tables.

However, If there isn’t a direct relationship between ResellerSales and Employee, the moment you add an unsummarized field from the second table on the many side, such as Employee[FullName] after adding SalesTerritoryCountry and ResellerSales[SalesOrderNumber), you’ll get the error “Error: Can’t determine relationships between the fields”.

Solution: Interestingly, the report works fine if a summarized field, such as COUNT(Employee[EmployeeKey]) is used. In this case, the SalesTerritory dimension acts as a conformed dimension joined to two fact tables. The reason why it doesn’t work when Employee[FullName] is added is because there is no aggregation on the Employee table and the relationship between ResellerSales[SalesOrderNumber] and Employee[FullName] becomes Many:Many over SalesTerritory which is now a bridge table. One employee may be associated with multiple sales and a sale can be associated with multiple employees. How do we solve this horrible problem? There are at least two ways:

If possible, create a direct relationship between ResellerSales and Employee. Now you have two dimensions joining a fact table and there is no ambiguity.
If the only join option is to relate Employee to SalesTerritory, set the relationship’s cross filter direction to Both. This will indirectly filter the ResellerSales table and might achieve the same result. For each employee on the report, Power BI travels the relationship to find the related territories at the lowest granularity. In Adventure Works, the lowest granularity of the SalesTerritory table is the territory region, such as South East, so the cross filter set to Both will select all regions that are associated to the current employee. From there, it will get the sales orders and sales associated with these regions.

Where the report using the second option differs from the first is if there are multiple employees associated with the same region causing the sales order numbers to repeat although the report total is the same.

Power BI supports flexible schemas but how you construct relationships may impact the report results. You should always start with a simple schema and grow in complexity if needed.

Download the sample

Atlanta MS BI and Power BI Group Meeting on June 25

June 19, 2018/0 Comments/in Blog, Events/by Prologika - Teo Lachev

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, June 25th at 6:30 PM. Your humble correspondent will share 10 ways Power BI can help augment your existing or envisioned Power BI strategy. DevScope will sponsor the meeting and demo their PowerBI Robots offering. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the main page) if you’re planning to attend.

Presentation:	10 Ways to Empower Your BI Strategy with Power BI
	Level: Intermediate
Date:	June 25, 2018
Time	6:30 – 8:30 PM ET
Place:	South Terraces Building (Auditorium Room) 115 Perimeter Center Place Atlanta, GA 30346
Overview:	Not sure what value Power BI can bring to your BI infrastructure? Join me to discuss 10 ways Power BI can help augment your existing or envisioned Power BI strategy. If you’re interested in the Power BI but you’re not sure how it fits within your organizational data strategy, this event is for you. Discussion points include: • Organizational BI • Self-service BI • Cloud vs. on-premises deployments • Predictive analytics • External reporting • Integrated solutions Get your Power BI questions answered and see demos along the way.
Speaker:	Teo Lachev is a consultant, author, and mentor, with a focus on Microsoft Business Intelligence. Through his Atlanta-based company “Prologika” (a Microsoft Gold Partner in Data Analytics) he designs and implements innovative BI solutions that bring tremendous value to his clients. Teo has authored and co-authored several SQL Server BI books, and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Microsoft has recognized Teo’s expertise and contributions to the technical community by awarding him the prestigious Microsoft Most Valuable Professional (MVP) status since 2004.
Sponsor:	DevScope is a young, dynamic and experienced company, specialized in mentoring and development services in Web environments, and a pioneer in the region, integrating Microsoft technology, products and solutions. DevScope implements business and technology solutions with established and emerging technologies every day. DevScope projects are usually based on the latest technologies available from Microsoft, and many times, those same technologies are not yet available in the market.
Prototypes with Pizza	DevScope Power BI Robots by Rui Romano, DevScope

Common Data Service for Analytics: The Good, the Bad, the Ugly

June 14, 2018/0 Comments/in Blog/by Prologika - Teo Lachev

UPDATE 11/15/2018: Common Data Service for Analytics is superseded by Power BI dataflows. Find the updated review here.

In a previous blog I discussed the Common Data Service for Apps (CDS for Apps). I explained that CDS for Apps is more suitable for OLTP-type applications, which is why its main client is PowerApps apps saving data in normalized tables. Since good things shouldn’t come alone, Microsoft is readying another CDS flavor, Common Data Service for Analytics (CDS for Analytics), which is oriented towards supporting analytical requirements. Microsoft provided a good introduction to CDS for Analytics in the “Common Data Service for Analytics (CDS-A) and Power BI – an Introduction” video and “Introduction to Common Data Service For Analytics” video. Without rehashing what has been already announced and said, I’d like to share a few notes from what I’ve learned so far.

The following table compares the two CDS types.

	CDS for Apps	CDS for Analytics
Primary usage	OLTP	OLAP
Primary tool for loading data	PowerApps/Power Query	Power Query
Primary tool for reading data	PowerApps/Power BI	Power BI
Data storage	Azure SQL Database	Azure Blob Storage (a CSV text file per entity and a JSON file for the schema)
Power BI connectivity mechanism	OData	Azure Blob Storage
Pricing	Included in PowerApps plans	Included in Power BI Pro/Premium
Storage Quota	10 GB per database	Restricted by associated app workspace quota
Add-ons	Logic and validation	Power BI Insights apps

Note that both CDS types target business users willing to store and analyze data in a business-friendly staging database. Over time Microsoft hopes that partners will deliver more value to CDS by implementing apps (CDS apps are like the prepacked apps that already exist in Power BI, such as for Salesforce and Dynamics).

In fact, apps are so important for the success of CDS that Microsoft listed the CDS for Analytics-powered apps, dubbed Power BI Insights, as another product in the Power BI portfolio.

CDS for Analytics is a standard feature of Power BI so every Power BI Pro user can access it. CDS for Analytics is exposed to the end user in Power BI as datapools. A datapool is a collection of entities associated with a Power BI app workspace. An entity maps to a text file in Azure Storage. Business users will rely on Power Query to populate (manually or via a scheduled refresh) entities in CDS for Analytics. You can access the workspace datapool in the workspace content page.

The Good

I can think of three primary scenarios where CDS for Apps can deliver value as it stands today:

Offline data staging – Let’s say IT doesn’t allow direct connectivity to LOB applications but you need to create some reports on top of this data. You can stage the data as text files into CDS for Analytics. I don’t think CDS for Analtyics would bring much value if you could connect directly to it in Power BI Desktop if direct connectivity is an option. The more you move the data, the more problems you may run into. At least for now, having apps on top of text files doesn’t look like a good reason to me but I guess we have to see what apps will become available in time.
Prepackaged third-party solutions – Sometime ago, a software vendor asked me how they can deploy a solution to Power BI for their customers but still retain ownership. Back then I didn’t have a good answer but CDS for Apps might be a good option now. In fact, besides the Power Query as a primary tool for loading entities, any service that can write to Azure Storage can bring data to CDS for Analytics. The ISV can write the entities as CSV files and tell CDS Analytics to “mount” the storage container. CDS Analytics can now see these mounted entities and treat them as part of the whole. Worried about protecting intellectual property? Currently only the Insight App installer would have access to the installed workspace and artifacts (other users in the organization would just see the published reports which are shared with them).
Prepackaged insights – Like CDS for Apps, CDS for Analytics understands the Common Data Model. Over time, Microsoft and partners can contribute prepackaged “insights” that are built on top of popular LOB apps, such as Dynamics or Salesforce.

Pricing is also right. CDS for Analytics is included in Power BI although it storage counts towards the workspace quota. Another thing I like about CDS for Analytics is that the Power BI connector is very fast unlike the CDS for Apps connector.

The Bad

As of now datapools support only a small subset of the Power Query connectors. This is probably just a temporary limitation for the preview cycle. I’d imagine that all Power BI connectors for cloud and on-premises data sources will be eventually available. Continuing on the list of limitations, like CDS for Apps, CDS for Analytics doesn’t support incremental refreshes so be careful downloading millions for rows every night.

The Ugly

CDS for Analytics promises to break silos but a datapool is associated with a Power BI workspace. This architecture fragments CDS for Analytics into Power BI workspaces. However, most users would probably require access to common entities, such as Customer, Product. Not only this is not possible but the datapool storage is also limited by the workspace quota. So, if you are a Power BI Pro user who has access to an app workspace, you’re currently limited to 10 GB storage quota which includes not only Power BI datasets but also CDS entities. I wish that CDS has no association to workspaces and it was designed a global staging area, just like Azure Storage. Microsoft has promised at some point in future to allow you to reference entities between datapools in different workspaces and create calculated entities on top of them.

CDS for Analytics and Power BI Insights are new additions to Power BI. CDS for Analytics delivers Operational Data Store (ODS) to business users that is populated and maintained by business users. Microsoft and partners can augment CDS for Analytics with Power BI Insights apps.

Power BI Feature Discrepancies for Data Acquisition

June 10, 2018/1 Comment/in Blog/by Prologika - Teo Lachev

Power BI churns out new features fast but not all features are available everywhere. As one of the most confusing aspects of Power BI, feature availability depends on the data acquisition method (data import vs live connections) and across Power BI product offerings (Power BI Desktop, Power BI Service, Power BI Report Server, and Power BI Mobile). Microsoft has left gaps in the product documentation and UI to clearly indicate feature availability. For example, while you can add a Q&A button in Power BI Desktop irrespective of whether you import or connect live, the button won’t work with live connections because Q&A for Creators is available only when data is imported.

This blog is my first attempt to clarify the feature availability based on the data acquisition method. All features work when you import data. Therefore, this option is not listed in the table. Direct connections however and not that fortunate. DirectQuery is when you connect directly to all data sources except Analysis Services. Because Power BI knows more about Analysis Services, direct connectivity to Analysis Services is listed separately. A while back I asked Microsoft why certain features are not supported with live connections to Analysis Services, especially to Tabular. To me, there shouldn’t be any difference between data import and using an external Tabular connection. At the end, it’s all Tabular behind the scenes. The answer back then was related to performance concerns for chatty features with external Tabular models. I continue pushing to eliminate these discrepancies if possible over time and provide more feature parity. The table omits features that are supported by the three connectivity options.

	DirectQuery	Live Connection to Analysis Services¹
Binning	Yes	No
Calculated column	Yes²	No
Calculated measure	Yes²	Yes²
Change field data type	Yes	No
Change field formatting	Yes	No
Clustering	No	No
Custom groups	Yes	No
Data categories for fields	Yes	No
Explain Increase/decrease (Power BI Desktop)	No	No
Fields properties	Yes	No
Hierarchies	Yes	No
Power Query	Yes	No
Q&A in dashboards (Q&A for consumers)	No	Tabular only
Q&A in reports (Q&A for creators)	No	No
Quick Insights (Power BI Service)	No	No
Relationships	Yes	No
Row-level Security (Power BI Desktop)	Yes	No³
Synonyms	Yes	No
What-if	No	No

In general, besides calculated measures for Tabular, no modeling features are available with live connections to Analysis Services. This makes sense to avoid a semantic model (Power BI Desktop) over a semantic model in Analysis Services. In fact, Data, Relationships, and Query Editor (Power Query) are not available when connecting directly to Analysis Services.
See this page for DAX limitations in DirectQuery mode.
Analysis Services has its own security mechanism.

Atlanta MS BI and Power BI Group Meeting on May 21

May 14, 2018/0 Comments/in Blog, Events/by Prologika - Teo Lachev

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on Monday, May 21st at 6:30 PM. Paco González will take us into a deep dive into Microsoft Cognitive Services and how to integrate them with Power BI. SolidQ will sponsor the meeting. And your humble correspondent will introduce you to Power BI Premium incremental refresh. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the main page) if you’re planning to attend.

Atlanta BI Group May 21st Meeting Reminder

Please RSVP to help us plan food by 2 PM on the day of the meeting by latest:

Go to the Atlanta BI home page (atlantabi.pass.org).
Choose Yes and submit the RSVP survey found at the right top corner of the page.

Please note that the voting system tracks votes by cookies. If you don’t see the option to submit the survey, clear your browser cookies, or right-click the Internet Explorer icon, and then click Start InPrivate Browsing. Free covered parking is available at South or North Terraces. Or, use the free open non-ticket parking next to North Terraces (coming north on Ashford Dunwoody, turn left on Perimeter Center Terrace and them immediately right).

Presentation:	Artificial Intelligence, Cognitive Services, and Power BI
	Level: Intermediate
Date:	May 21, 2018
Time	6:30 – 8:30 PM ET
Place:	South Terraces Building (Auditorium Room) 115 Perimeter Center Place Atlanta, GA 30346
Overview:	Microsoft Cognitive services is a series of services based on Microsoft Research using Deep Learning techniques, ready for you to enrich your unstructured data. Microsoft Services includes: Vision (Computer Vision, Emotion, Face, Video), Speech/Language (Speaker Recognition, Spell, Sentiment, Language, Key Phrases, Topic detection, Linguistic Analysis), Knowledge, and Search. These services are consumed by using the Cognitive Services APIs. Power BI enables you to consume the APIs to enrich your unstructured data inside your Power BI solutions. During this session, we will cover how to consume the API using Power Query to then visualize the results with Power BI visuals. We will also consume the Cognitive services API by using a R script inside Power BI. This session is structured with a B2C Survey case study solution with a real time Dashboard using a streaming dataset.
Speaker:	SolidQ North America’s CEO, Paco González specializes in helping organizations become “data driven” from a strategic and technical perspective. He is a speaker at small and large conferences such as PASS Summit, Ignite, DevWeek London, or PAW. Paco focuses in Business Analytics and Artificial Intelligence. Paco has published several books and whitepapers. A Microsoft Certified Trainer and Microsoft Certified Professional. Paco is based in Atlanta.
Sponsor:	The tools and technologies to manage information will constantly evolve, but the need for accurate and actionable information is timeless. SolidQ delivers services for Microsoft platforms that help you architect, integrate and optimize your use of data. The result? Our clients think bigger and move faster because we help them build the capacity and skills to interact with data in creative, collaborative ways that deliver new insights to the business. At SolidQ we measure our success by your satisfaction – and we guarantee it.

Microsoft Common Data Service for Apps: The Good, the Bad, the Ugly

May 12, 2018/0 Comments/in Blog/by Prologika - Teo Lachev

BI and data integration projects often benefit from an operational data source (ODS), whose benefits and design I discussed in my “Designing an Operational Data Store (ODS)” newsletter. Think of the Microsoft Common Data Service for Apps (CDS for Apps) as a cloud ODS by Business and for Business. Officially introduced in 2016 and running on Azure SQL Database, CDS for Apps is now the entity and data model behind Dynamics 365. This is where Dynamics 365 stores its data. Microsoft is readying a preview of another CDS flavor (CDS for Analytics) in June 2018. Think of CDS for Apps as the OLTP version of the ODS and is designed for transactions (this is why it’s layered on top of SQL Server), while CDS for Analytics is oriented towards OLAP (this is why is layered on top of Azure Storage).

Both CDS types support standardized entities, whose definitions are documented in the GitHub repository of the Common Data Model. Currently, the schema of these entities is designed and controlled by Microsoft and it’s limited to Dynamics entities, such as Account, Opportunity, and so on. However, Microsoft hopes that other vendors will provide solutions and extend the CDS schema. Of course, because CDS is your database, you can extend it with your own custom entities. The rest of the document focuses on CDS for Apps only.

The Good

There is a lot to like about CDS for Apps. Let’s start with pricing. Other vendors, such as Oracle and Teradata, have similar visions and products but their offerings are very expensive. The CDS for Apps pricing is included in the PowerApps licensing model because PowerApps is the primary client for creating CDS for Apps-centered solutions. Using CDS outside selected Dynamics 365 plans (that include it already) will cost you at least $7 per user and per month.

CDS for Apps is more than just a data repository. It’s a business application platform with a collection of data, business rules, processes, plugins and more. In this regard, it resembles SQL Server Master Data Services (MDS). The modeler can:

Define and change entities, fields, relationships, and constraints. For example, the screenshot shows a custom Device Order entity that I’ve created.
Business rules, such as to prepopulate Ship Date based on Order Date.
Secure data to ensure that users can see it only if you grant them access. Role-based security allows you to control access to entities for different users within your organization.

Besides the original PowerApps canvas apps (like InfoPath forms), CDS for Apps also opens the possibility to create model-driven PowerApps applications (require PowerApps P2 plan). Model-driven apps are somewhat like creating Access data forms but more versatile. Because PowerApps knows CDS for Apps, you can create the app bottom-up, i.e. start with CDS for Apps and then generate the app based on the actual schema and data. For example, you can use PowerApps to build a model-driven app for implementing the workflow for approving a certain process. Model-driven apps are a new style of a PowerApps application that makes it easy to build entity forms, entity views, and workflows.

How do you get data into CDS for Apps custom entities? Your PowerApps app can write to it. Or, you can create and schedule a project that uses Power Query (yep, the same one as in Power BI) to load data from somewhere into CDS for Apps.

The Bad

How do you get data out from CDS for App, such as to import data from some entities into a Power BI model? Microsoft has released a preview build of the Common Data Service for Apps connector for Power BI. However, this connector is even slower that the Dynamics connector. It uses the OData v4 Web API. Based on my limited tests, it took the connector about a minute to download 40,000 rows from Dynamics, clocking 10% slower than the Dynamics connector. To make things worse, the connector doesn’t support query folding, so Power BI must download the entire dataset before Power Query applies filters. Because the connector doesn’t support also REST filter and select predicates, so you can’t filter data or select a subset of columns at the source. Microsoft is actively working on improving the connector performance and it might get better in time.

Continuing down the list of limitations, CDS for Apps doesn’t support change tracking (to capture changes to a given row) and incremental loads, such as to load or refresh only the data that has changed yesterday or previous month. These are all essential features that could make ODS even more valuable.

The Ugly

For years people were complaining that after migrating from the on-premises Dynamics to the cloud, they lost the ability to connect to its database directly and they had to rely on the REST APIs (slow) or Data Export Service to export the data to an SQL Server Database (fast but requires additional effort and budget). Unfortunately, although CDS for Apps stores data in Azure SQL Database, Microsoft doesn’t expose its database directly to get data out fast and bypass the REST endpoint. When I raised this issue to Microsoft I got feedback that CDS for Apps is a business platform and there are layers on top of data to handle security, rules, calculations, and so on. However, the argument that CDS for Apps is more than just a database is nonsensical to me. Try to explain to a customer that cakes have layers and CDS for Apps has layers, and therefore getting something out of it is slow. As I mentioned, the “layered nature” of the CDS is conceptually like MDS. In fact, I see a lot of overlap. MDS also supports rules, security, etc. but it doesn’t force me to go through the web service interface if all I need is the raw data. Hence, my wish to support direct connectivity to the Azure SQL Database endpoint of CDS for Apps.

The success of Common Data Services for Apps will depend largely on adoption and contributions by Microsoft partners. Although it lacks typical ODS features and fast connectivity, CDS for Apps gains in “business platform” features. It will be interesting to see how much traction it will gain with business users as a staging area given that CDS for Apps is designed from ground up for business users.

Another Successful Power BI Workshop

May 11, 2018/0 Comments/in Blog, News/by Prologika - Teo Lachev

Thank you to everyone attending the Power BI workshop by Prologika yesterday. We had another great turnout with more than 50 people from 30+ companies attending the event at the Microsoft Technology Center in Alpharetta! Kudos to Eric Flamm and Michael Folarinde for the great job proctoring the event. Here is some feedback that the attendees shared in the post-event survey:

“I was frustrated and confused with using Power BI service and Desktop because I didn’t know how to manipulate the visualizations, slicers, etc. and thought the tools were not capable. I walk away from this now knowing this information and excited to use the tools to transform data to help our business.”
“Very informative and showed me new things despite me using PBI for a few months.”
“Very instructive and knowledgeable of our instructor Teo. He is very patient and answered all my questions.”
“The presenter was great. He was obviously very knowledgeable, was great at explaining the concepts, and was very patient.”
“The presenter, Teo, was extremely knowledgeable of Power BI. He presented the material in a clear and easy to follow format. He was able to answer all levels of questions from beginner to advanced and even questions regarding infrastructure and how to implement Power BI into small or large organizations. I really enjoyed that we were given time to actually perform the various tasks ourselves and then get help along the way. It really helped to reinforce the lessons we had just learned.”
“…The instructors were very helpful with any questions.”

And my next week is entirely booked with more Power BI training. It’s great to witness and participate in the great momentum surrounding Power BI!

Notes on Power BI Incremental Refresh

May 8, 2018/4 Comments/in Blog/by Prologika - Teo Lachev

In my “Why Business Like Yours Choose Power BI Over Sisense” blog, I discussed how Sisense seeks a competitive advantage by offering a user-friendly designer connected to the their ElastiCube, thus claiming that Sisense is a better choice with larger datasets because business users don’t have to use professional tools (Visual Studio) or deploy to external servers. Things have changed in Power BI since then. Microsoft increased the dataset size to 1 GB (Power BI Pro) and 10 GB (Power BI Premium). A 10 GB compressed dataset is a lot of data (probably allowing you to import over a billion rows). But fully refreshing a large dataset is no fun and it can take many hours (see my blog “Processing AAS Models Asynchronously” for some stats).

Fortunately, the May release of Power BI Desktop brings incremental refresh, which a feature that was previously only available in Analysis Services. As its name suggests, the primary goal is to reduce the time required to refresh datasets with imported data by processing only data that has changed instead of the entire dataset.

I was involved and provided feedback during the prerelease testing of Power BI incremental refresh. Without reiterating the documentation, here are some things you need to pay attention to:

Incremental refresh is a Power BI Premium feature. Although you configure it in Power BI Desktop, you can’t test it in Power BI Desktop. You must deploy the file to a premium workspace to put it in action. I provided feedback that this feature should be also available in shared capacity workspaces because its primary goal is to reduce resources in a shared environment but as it stands incremental refresh is a premium feature.
Associate the RangeStart and RangeEnd query parameters in a custom filter to one column in the table set up for incremental refresh. During development, it makes sense to load a subset of the data, such as for only one year. So, set RangeStart and RangeEnd accordingly.
When you define a custom filter that uses RangeStart and RangeEnd on the table you configure for incremental refresh, triple verify the filter range to ensure that rows don’t overlap. Here is an example of a correct custom filter. If I make a mistake and I change the right boundary to [OrderDate] <= RangeEnd, then I’ll get overlapping rows for the end period. So, make sure that you don’t specify the equal sign in both the left boundary and right boundary.
= Table.SelectRows(dbo_FactResellerSales, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
I suggested Microsoft provide a narrative that restates the configuration of the incremental refresh policy. For example, the configuration below means “This policy will load historical data from Jan 1^st, 1997 to the current date. Subsequent refreshes will process only the last 10 days to the current system date (UTC time).”

When you deploy to *.pbix file to a premium workspace and navigate to the dataset refresh page, you’ll see that the entire Parameters section is disabled. This is a bug which should be fixed by the time incremental refresh goes out of preview. It should remove or disable only the RangeStart and RangeEnd parameters but enable other query parameters. That’s because the RangeStart and RangeEnd parameter values are not used once the model is published.
Because incremental refresh requires a new schema comparability level, you can’t connect to Power BI Desktop models in SSMS. Incremental refresh also breaks the feature for importing Power BI Desktop files in Azure Analysis Services.

What happens behind the hood?

You should appreciate how simple Power BI incremental refresh is to set up. BI pros know that SSAS partitions take some effort to set up. It gets more complicated when you need to set up rolling partitions and merge partitions. In Power BI, you only need to configure the refresh policy and Microsoft takes care of the rest. Behind the scenes, Microsoft introduced a new partition type that has a refresh policy and parameterized query using the RangeStart and RangeEnd parameters. When the model is refreshed, Power BI expands this partition type and creates the actual table partitions. Considering the above setup, it creates 20 yearly partitions, 1 Quarter partition (for Q1 2018), and 35 daily partitions, assuming the current date is May 5, 2018. These partitions get consolidated and merged over time. For example, once Q2 is full, all daily partitions are merged into a monthly partition, and when the quarter is complete, monthly partitions get merged into a quarterly partition, then quarters get merged into years. Power BI also takes care of removing older partitions when they fall off the range (sliding window).

As the documentation mentions, a “preview” limitation is that subsequent publishing from Power BI Desktop replaces the published refresh policy and triggers a full load on the next refresh. Remember that full load happens once the first time you initiate manual or scheduled refresh of the published dataset. Once the dataset is fully loaded, subsequent refreshes load the dataset incrementally (the last 10 days with the above configuration). Currently, there isn’t a way to reload the dataset (full refresh), such as when you discover the historical data has issues and you need to reload the history, except deleting the dataset and republishing it. I hope that Microsoft will enable access to the Analysis Services Tabular management endpoint to support such scenarios. Or, a better option might be to extend the Parameters section in the refresh page to include an option for full load so that the business user can trigger a full reload.

Query Folding

One important consideration to keep in mind is that incremental refresh works best with data sources that support query folding. Such data sources pass through some Power Query transformation steps, such as filtering and grouping, to the underlying data source to improve performance and avoid loading the entire dataset before filtering and grouping is applied. To determine if a specific step is “folded”, right-click the step, such as Filtered Rows, in the Applied Steps pane and observe if the “View Native Query” is enabled.

Although Microsoft doesn’t have an official list of data sources that support query folding and what steps are folded, typically relational data sources and data sources that can handle the SQL WHERE clause support query folding. Excel and text files do not. It’s important to check query folding because it the data source doesn’t support it, Power BI Desktop currently doesn’t prevent incremental refresh but it will load all the data before the filter is applied. Specifically, the query mashup (M) engine will apply the filter as it reads the rows before it gets loaded into the partitions. You might be able to mitigate this performance issue with non-foldable sources by applying the RangeStart/RangeEnd filter when the initial query is sent to the data source. For example, Dynamics Online supports a $filter clause that will work with incremental refresh:

= OData.Feed(“<endpoint url>/sales?$filter=CreatedDate ge ” & Date.ToText(RangeStart) & ” and CreatedDate lt ” & Date.ToText(RangeEnd)”)

Power BI incremental refresh is an important enhancement to the Power BI data architecture. A Power BI premium feature, incremental refresh brings the following benefits for tackling larger datasets: data loads are faster, data loads are more reliable (no long-running connections), and reduced resource consumption.