Atlanta Microsoft BI Group Meeting on December 4th (Financial Modelling in Microsoft Fabric)

Atlanta BI fans, please join us for the next meeting on Monday, December 4th, at 6:30 PM ET. William Rodriguez will review the key Fabric components whilst walking through a full Financial deployment: data engineering with a tiered medallion architecture, data science with forecasting and other advanced analytics, and data analysis with Power BI. Your humble correspondent will help you catch up on Microsoft BI latest. For more details and sign up, visit our group page.

PLEASE NOTE A CHANGE TO OUR MEETING POLICY. WE HAVE DISCONTINUED ONLINE MEETINGS VIA TEAMS. THIS GROUP MEETS ONLY IN PERSON. WE WON’T RECORD MEETINGS ANYMORE. THEREFORE, AS DURING THE PRE-PANDEMIC TIMES, PLEASE RSVP AND ATTEND IN PERSON IF YOU ARE INTERESTED IN THIS MEETING.

Presentation: Financial Modelling in Microsoft Fabric

Delivery: In-person

Date: December 4

Time: 18:30 – 20:30 ET

Level: Intermediate

Food: TBD

 

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

 

Venue

Improving Office

11675 Rainwater Dr

Suite #100

Alpharetta, GA 30009

Overview: Fabric is a powerful integrated BI platform. This presentation will review they key components whilst walking through a full Financial deployment: data engineering with a tiered medallion architecture, data science with forecasting and other advanced analytics, and data analysis with Power BI.

Speaker: William Rodriguez is the founder and lead consultant of Analytical Ants, an analytics company focused on procuring insights via data engineering, data science, and data analysis. William has led multi-million-dollar construction projects, acted as financial controller managing million-dollar monthly check productions, and built and deployed BI infrastructures at companies large and ‘small’ ($60M Revenue). Academically, he has his master’s in business, two undergraduate business degrees, four active Microsoft certifications, and over 70 accredited SQLBI training hours. William loves spending time with his amazing family and knows that he would be nowhere without Christ.

Sponsor: TBD

PowerBILogo

Atlanta Microsoft BI Group Meeting on November 6th (Exploring Bravo for Power BI)

Atlanta BI fans, please join us for the next meeting on Monday, November 6th, at 6:30 PM ET. Macro Russo will introduce us to the community tool Bravo for Power BI. Your humble correspondent will help you catch up on Microsoft BI latest. I’ll sponsor the event. For more details and sign up, visit our group page.

PLEASE NOTE A CHANGE TO OUR MEETING POLICY. WE HAVE DISCONTINUED ONLINE MEETINGS VIA TEAMS. THIS GROUP MEETS ONLY IN PERSON. WE WON’T RECORD MEETINGS ANYMORE. THEREFORE, AS DURING THE PRE-PANDEMIC TIMES, PLEASE RSVP AND ATTEND IN PERSON IF YOU ARE INTERESTED IN THIS MEETING.

Presentation: Exploring Bravo for Power BI

Delivery: In-person

Date: November 6

Time: 18:30 – 20:30 ET

Level: Beginner

Food: Pizza and drinks will be provided

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Venue

Improving Office

11675 Rainwater Dr

Suite #100

Alpharetta, GA 30009

Overview: Bravo for Power BI is a free external tool to analyze a model, format DAX, export data, and manage dates table and time intelligence calculations. We will explore this tool’s capabilities, with additional insights on the time intelligence calculations feature that can be customized by creating new templates. Some features are also available in more advanced tools like DAX Studio, but the goal in Bravo is to make them available to all Power BI users providing a simpler user interface that is good for the more common use cases.

Speaker: Marco Russo is a business intelligence consultant and mentor. He first started working on Analysis Services in 1998, back when Analysis Services was known as OLAP Services. Marco wrote several books about Power BI, Analysis Service, and Power Pivot with Alberto Ferrari. They also regularly write articles and white papers that are available on sqlbi.com.

Sponsor: Prologika

PowerBILogo

ADF Cross Environment Schema Compare Tool

Almost every BI project requires at least two environments, such as DEV and PROD. Unless I’ve missed something, one Azure Data Factory caveat irrespective of the code promotion path (manual or automated with Azure DevOps or Git CI/CD) is that it uses ARM template deployment where the template contains all the artifacts. The equivalent for database schema deployment would be to script the entire database schema which makes me nervous. Not to mention that ARM templates have their own gotchas, such as the one I discussed here when shared self-hosted integration runtimes are used.

So, I got on a quest to find an ADF schema comparison tool for a quick and dirty way to identify ADF code discrepancies between two environments and promote changes manually. Ideally, the tool would be something like SQL Server Database projects in Visual Studio.

However, I wasn’t able to find such a schema comparison tool for ADF. The closest match was a PowerShell script by Riccardo Pericoto that extracts the ADF artifacts as files. My humble contribution was a few minor changes to the output folders, so that the artifacts from the ADF DEV service are saved in subfolders under a “dev” folder, while the production artifacts are saved under a “prod” folder. I’ve made this change to take advantage of a WinMerge feature that can scan subfolders. I’ve also added a line to delete all files in the target folders.

I enclose the script with the changes in bold:

# Install-Module -Name Az.DataFactory
# Credits: Riccardo Perico https://medium.com/riccardo-perico/comparing-azure-data-factories-exporting-all-the-components-929861ad2e00
Login-AzAccount
Get-AzSubscription -SubscriptionName 'MTX_Int_BI' | Set-AzContext -Name 'MyContextName'
Set-AzContext -Context $(Get-AzContext -Name "MyContextName")
$base_output_url = "C:\temp\adf\"
$pipeline_output = "activities"
$dataset_output = "datasets"
$linked_services_output = "linkedservices"
$triggers_output =  "triggers"

$adfv2_rsg_name_tc1 = 'mtx-rsg-adf-dev'
$adfv2_rsg_name_tc2 = 'mtx-rsg-adf-prod'
$adfv2_name_tc1 = 'mtx-adf-dev'
$adfv2_name_tc2 = 'mtx-adf-prod'

#Delete all files
Get-ChildItem -Path $base_output_url -Include *.* -File -Recurse | foreach { $_.Delete()}
$js = (Get-AzDataFactoryV2Pipeline -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1) | Select-Object "Name"
$js2 = (Get-AzDataFactoryV2Pipeline -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2) | Select-Object "Name"
$js_dst = (Get-AzDataFactoryV2Dataset -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1) | Select-Object "Name"
$js_dst2 = (Get-AzDataFactoryV2Dataset -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2) | Select-Object "Name"
$js_lks = (Get-AzDataFactoryV2LinkedService -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1) | Select-Object "Name"
$js_lks2 = (Get-AzDataFactoryV2LinkedService -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2) | Select-Object "Name"
$js_trg = (Get-AzDataFactoryV2Trigger -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1) | Select-Object "Name"
$js_trg2 = (Get-AzDataFactoryV2Trigger -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2) | Select-Object "Name"

foreach ($pl in $js)
{
    (Get-AzDataFactoryV2Pipeline -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1 -Name $pl.Name).Activities | Sort-Object "Name" | Out-File $base_output_url"dev\"$pipeline_output"\$($pl.Name).txt"
}
foreach ($pl2 in $js2)
{
    (Get-AzDataFactoryV2Pipeline -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2 -Name $pl2.Name).Activities | Sort-Object "Name" | Out-File $base_output_url"prod\"$pipeline_output"\$($pl2.Name).txt"
}
foreach ($d in $js_dst)
{
    (Get-AzDataFactoryV2Dataset -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1 -Name $d.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"dev\"$dataset_output"\$($d.Name).txt"
}
foreach ($d2 in $js_dst2)
{
    (Get-AzDataFactoryV2Dataset -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2 -Name $d2.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"prod\"$dataset_output"\$($d2.Name).txt"
}
foreach ($l in $js_lks)
{
    (Get-AzDataFactoryV2LinkedService -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1 -Name $l.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"dev\"$linked_services_output"\$($l.Name).txt"
}
foreach ($l2 in $js_lks2)
{
    (Get-AzDataFactoryV2LinkedService -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2 -Name $l2.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"prod\"$linked_services_output"\$($l2.Name).txt"
}
foreach ($t in $js_trg)
{
    (Get-AzDataFactoryV2Trigger -ResourceGroupName $adfv2_rsg_name_tc1 -DataFactoryName $adfv2_name_tc1 -Name $t.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"dev\$triggers_output$($t.Name).txt"
}
foreach ($t2 in $js_trg2)
{
    (Get-AzDataFactoryV2Trigger -ResourceGroupName $adfv2_rsg_name_tc2 -DataFactoryName $adfv2_name_tc2 -Name $t2.Name).Properties | Sort-Object "Name" | Out-File $base_output_url"prod\$triggers_output$($t2.Name).txt"
}

Once the files are saved, the next step is to use WinMerge configured to scan subfolders to identify differences:

Then, you can manually copy the JSON payload of the changed artifact and replace its equivalent in the other environment. Long live ADF continuous integration and delivery!

Deactivating ADF Activities

Lo and behold, Azure Data Factory now supports deactivating activities as a preview feature. It took only seven years…

You can deactivate/activate activity by right-clicking it in the canvas or from its General settings. You can also configure how the subsequent activities should treat a deactivated activity, with Succeeded being the most common outcome.

When All You Have is a Hammer… (Dbt Tool for ETL)

Someone asked the other day for my opinion about the open-source dbt tool for ETL. I hadn’t heard about it. Next thing I’ve noticed was that Fabric Warehouse added support for it, so I got inspired to take a first look. Seems like an ELT-oriented tool. Good, I’m a big fan of the ELT pattern whose virtues I extolled many times here. But a Python-based tool that requires writing templates in a dev environment, such as Visual Studio Code? Yuck!

So, what’s my first impression? Same thoughts as when I see developers use another generic programming language, such as C#, for ETL. You can do it but why?

For years, I’ve been trying to get developers out of custom coding for ETL to low-code ETL specialized tools, such as SSIS and ADF.

Just because you studied Python in college, should you use Python for everything? I guess the open-source custom code gravitational pull is too strong. Or there are plenty of masochists out there. One of their case studies hailed dbt for democratizing ETL because everyone knows SQL and can contribute. That’s true but what goes around SQL is also important, as well productivity and maintainability of the overall solution.

On the Azure platform my preferred ETL architecture remains ADF and ELT with SQL in stored procedures. I don’t see any dbt advantages. Dbt might make sense to you if want to stay vendor-neutral, but I’d argue that if you follow the ELT pattern, migrating your ETL processes to another vendor would be trivial.

What am I missing?

UPDATE 10/19/2023

I’m humbled by the interest and comments this blog inspired on LinkedIn. It might well become one of the most popular posts I’ve ever written! This is what I’ve established based on the feedback from people who have used dbt:

  1. Although designed for ELT, it does only the “T” (transformation) part of the ELT process. You’d still need other tools, such as ADF, to extract and load the data. And so in a typical DW project, you could use ADF to extract data from the data sources, stage it, and then execute your dbt process to load the DW tables.
  2. It’s a template-based tool, where you use Python-like syntax to define “models”, such a model corresponding to a fact table, and the SQL statement to load it. Therefore, the SQL statement is embedded inside the template.
  3. It supports features, such as tests, macros, lineage, and documentation.

I wonder how many ETL projects would need these features though. If yours does, then dbt might be of interest. For me, the tool is still a hard pass that shares the same dark corner as ETL automation tools and ETL with custom code. Everybody is trying to do more with less nowadays, so maintainability and productivity are more important. If you have resources and time, my recommendation would be to invest into a home-grown SQL generator that would auto-generate the “T” part, such as the MERGE statement.

Atlanta Microsoft BI Group Meeting on October 2nd (30 Power BI Tips and Tricks)

Atlanta BI fans, please join us for the next meeting on Monday, October 2nd, at 6:30 PM ET.  Michael Hewitt will share 30 tips to make you more productive with everything Power BI. Your humble correspondent will help you catch up on Microsoft BI latest. For more details and sign up, visit our group page.

PLEASE NOTE A CHANGE TO OUR MEETING POLICY. WE HAVE DISCONTINUED ONLINE MEETINGS VIA TEAMS. THIS GROUP MEETS ONLY IN PERSON. WE WON’T RECORD MEETINGS ANYMORE. THEREFORE, AS DURING THE PRE-PANDEMIC TIMES, PLEASE RSVP AND ATTEND IN PERSON IF YOU ARE INTERESTED IN THIS MEETING.

Presentation: 30 Power BI Tips and Tricks

Delivery: In-person

Date: October 2nd

Time: 18:30 – 20:30 ET

Level: Beginner to Intermediate

Food: Sponsor wanted

Agenda:

18:15-18:30 Registration and networking

18:30-19:00 Organizer and sponsor time (events, Power BI latest, sponsor marketing)

19:00-20:15 Main presentation

20:15-20:30 Q&A

Venue

Improving Office

11675 Rainwater Dr

Suite #100

Alpharetta, GA 30009

Overview: This session offers 30 tips and tricks born from years of working with Power BI in the trenches and teaching Power BI to others. We’ll cover many facets of Power BI including DAX, Power Query, visualization, external tools, and more! These tips aim to be practical and useful to Power BI users of all skill levels so that you can take them back to work and level up your Power BI solutions.

Speaker: Michael Hewitt leads the BI and Analytics team at Hunt Brothers Pizza, based in Nashville, TN. He enjoys data visualization and working with business users to help them leverage data more effectively in their daily jobs. In previous roles, Michael has delivered Microsoft based BI solutions in the healthcare, distribution, and manufacturing industries. He has also had the privilege of developing and delivering custom Power BI training for the Department of Defense. Outside of work, Michael is a husband, father of 3 children, is a leader in his local church, co-organizes the Nashville Modern Excel and Power BI user group, and enjoys drumming, woodworking, and hiking.

 

Speaker: TBD

PowerBILogo

Hiking the Precipice Trail

One of trails my wife and I hiked today in the Acadia National Park was the Precipice Trail. This trail is considered one of the hardest trails in the United States. In fact, “hiking” is a misnomer as most of the ascent involves vertical rock climbing and pulling yourself up on iron rangs. Slipping and falling can surely result in a serious injury or death as there are no safety nets and there were such incidents in the past.

However, if you put yourself in the right mindset and prepare, you probably won’t find it that hard. In fact, we’ve found the descent more challenging than the ascent which we thought was exhilarating. And the Gulf of Main views were breathtaking!

How many ascents and views do we miss in life by not taking moderate risks?

Dissolving Partnerships

I’ve been an independent consultant for almost 20 years and through most of this time, my company (Prologika) has been a Microsoft consultancy partner with Gold Data Analytics and Gold Data Platform competencies. Speaking of which, I’d like to thank all the wonderful people who have helped Prologika attain and maintain these competencies over the years by taking rigorous certification exams. I couldn’t have done it without you!

My simple definition of a consultant is an external expert whose services improve the client’s situation. Therefore, the most important criterion for measuring the consultant’s effectiveness is how much “lift” the client gets.

Effectiveness is rarely tangible because it’s difficult to measure. However, in the past, the second component (besides certifications) for maintaining the Microsoft partnership was providing customer references to Microsoft. I don’t think Microsoft ever called the customers but if they did, a simple question would have been “On the scale from to 1 to 10, how do you measure the Prologika’s intervention to improve the objective they were hired to do?” Or, “would you rehire or recommend Prologika?” That’s it, no fancy scores are necessary. Most often, the client is either happy or unhappy, although shades of gray might exist.

Fast forwarding to Microsoft modern partnership requirements and things have changed starting the beginning of this year where Microsoft introduced new criteria for measuring the effectiveness of their partners expressed as a “partner capability score” – “a holistic measurement framework which evaluates the partner on categories of performance, skilling and customer success”. Before we get to it, I find it sad that as a woke company positioned to defend minorities (see the Microsoft’s wokeness report), Microsoft has gone to great lengths to destroy its small-biz partner ecosystem over the years, including:

  1. Favoring large “managed” partners.
  2. Replacing the Microsoft Software Assurance consulting credits, where the client had the discretion how to spend and allocate, with revenue-centric Microsoft-managed funds which are almost impossible to get approved for, such as ECIF and PIE.
  3. Replacing a designated Microsoft Partner Network point of contact with offshore outsourced service.
  4. All the promises given and broken through years. Too many to mention here but it looks to me that every now and then some manager comes which a checklist of all great things Microsoft will do for partners and then all if forgotten or suffers a natural death.
  5. Removing the Silver partner level.

Let’s now look at the new partner capability score for Data & AI, which spans three categories: performance, skilling, and customer success.

  • Performance – the partner must add up to 3 new customers every year. Apparently, Microsoft favors expansion of the customer base (more revenue) as opposed to doing recurring business to existing customers, which to me is by far a much better testimonial for effective services. But wait, there is a fine text explaining that not all customers are equal. “This is the number of unique customer tenants contributing at least $1000 ACR (Azure Consumed Revenue) in one of the last 2 months.” There we go. The emphasis is on selling more aggressively to customers.
  • Skilling – No surprises here. Microsoft wants a certain number of certified individuals. This is the only commonality with the old partner criteria.
  • Customer success – It quickly becomes evident that Microsoft equates “customer success” with revenue. Here we have two paths: number of deployments that depend on advanced Azure services represented in ACR (meaning expensive) and usage growth expressed as ACR growth across the customer base over the past 12 months.

There you have it. Microsoft now views partners as extension to their sales force that are pressured to sell and generate revenue to Microsoft. The more the partner sells, the better partner you are. Therefore, expect increased pressure from Microsoft partners to sell expensive products you probably don’t need, such as Power BI Premium or Fabric in the data analytics space.

The Microsoft VP who came up with this should have been fired on the spot. Gee, with all this focus on revenue, you might think that the partner gets some decent revenue cut, but that’s not the case. It doesn’t matter to me. I’m not making money from selling software and when I recommend a tool it’s always for the client’s best interest and my recommendation is usually very cost effective.

Given that my business values differ from the new Microsoft partnership values, I’m not planning to pursue further competency in the Microsoft partnership ecosystem. I’ll keep the hard earned and now retired Gold partner logo to remind me of what’s important. I’ll continue working with and recommending Microsoft products when it makes sense, such as when they are better from the competition, but without any strings attached.

Excel Connected Tables: Another Opportunity Missed

For years users have been asking for Excel tables (not pivots) connected to SSAS so they can simply browse the data in a tabular format not subjected to the shortcomings of pivots. And for years Excel has been supporting just pivots that natively understand only SSAS OLAP cubes. Is Power BI changing things? Somewhat. Enter refined Excel connected tables. They look great on the surface. You go through a wizard that lets you select and filter your data and you get an Excel table connected to a Power BI dataset. Awesome, right?

Unfortunately, what’s missing is the Field List. To make the feature useful, the user should be able to access the Field List and remove, add, filter, as they can do with pivots. However, once the table is generated, to change it you either must go through the wizard again or change the DAX query (right-click the table, click Table, and then Edit Query). Down the DAX rabbit hole we go again…

As they stand, Excel connected tables are not an interactive browsing tool. Instead, they are geared towards making Power BI datasets accessible in Excel.

Since I lost hope that Excel will discover Tabular and DAX one day, I hope the community will step in to fill in the gap by providing UI like the Pivot Field List so that the user doesn’t need to mess with DAX when all that’s needed is an Excel connected table for ad-hoc data exploration.

SaaS Winners – InTempo Software

I ranted many times about SaaS vendors preventing direct access to data in its native storage, which is typically a relational database. It’s sort of like going to a bank, handing them over your money, and they telling you that you can only get your money back in cash, in Bitcoin, in ₿100 bills but no more than ₿1,000 per withdrawal, or whatever madness suits the bank best. I wonder how long this bank will be in business. Yet, many companies just hand over the keys of the data kingdom to a SaaS vendor and they wonder how to get their “precious” back so they can build EDW and enable modern analytics. Since when direct access became an anathema perpetuated by mega software vendors, including Microsoft? I’ll repeat again that there are no engineering reasons to prevent direct access to your data, even if you must pay more to “isolate” you from other customers for security, performance, liability, or other “tradeoffs” for you choosing the SaaS path.

However, there are rays of sunshine here and there. In a recent integration project, I had to extract data from the InTempo ERP cloud software. Besides other integration options, InTempo supports ODBC access to their IBM Db2 database over VPN connectivity. Not only did InTempo give us direct access to the ERP data, but they also provided support services to help with the SQL queries. I couldn’t believe my luck. Finally, an enlightened vendor! On the downside, we had to send freeform SELECT statements down the wire from ADF while I personally would have preferred an option to package them in SQL views residing at the database, but I’d be pushing my luck.

Direct access to your data should be on top of your SaaS vendor selection criteria. TIP: if you hear “API” in any flavor (REST, ODATA, SOAP, etc.), run for the exit. I have nothing but horrible experience taking that path, even with Microsoft Dynamics APIs. “Data lake” is a tad better but still your best option is direct access to data in its native storage to avoid excessive integration effort and reduce data latency.