Posts

Atlanta MS BI and Power BI Group Meeting on June 3rd

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on June 3, Mondayat 6:30 PM at the Microsoft office in Alpharetta. I’ll show you how to bridge the analytical and developer worlds by integrating Power BI with PowerApps. Eric Flamm will demonstrate SQL Notebooks in Azure Data Studio. Prologika will sponsor the event. For more details, visit our group page and don’t forget to RSVP (fill in the RSVP survey if you’re planning to attend).

Presentation:Bridge Analytics and Developer Worlds with Power Platform
Date:June 3, 2017
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900

Alpharetta, GA 30009

Overview:One of the Power BI most prominent strengths is that it’s a part of the much broader Microsoft Power Platform and Data Platform that spans various on-premises and cloud offerings for data storage, processing, and analysis. Join this session to learn how to integrate Power BI with PowerApps and discover exciting new possibilities that redefine the meaning of a report and let you do more with less. During this entirely hands-on, almost no-slides session I’ll walk you through the steps to implement a popular report requirement for writeback, that Power BI doesn’t natively support.
Speaker:Through his Atlanta-based company Prologika (https://prologika.com), a Microsoft Gold Partner in Data Analytics, Teo Lachev helps organizations make sense of their most valuable asset: their data.  Teo has authored and co-authored several bestselling books on organizational and self-service data analytics (his latest is “Applied DAX with Power BI”), and he has been leading the Atlanta Microsoft BI and Power BI group since he founded it in 2010. Teo has been a Microsoft Most Valued Professional (MVP Data Platform) since 2004.
Sponsor:Prologika is one of the most trusted names in Data Analytics. Our clients, from small businesses to Fortune 100 enterprises, derive tremendous value from our services. Our mission is to help organizations make sense of data by applying the latest technologies for descriptive and predictive analytics and get actionable insights. Your organization will spend less time mining for information and be better equipped to make sound business decisions.
Prototypes with Pizza“SQL Notebooks in Azure Data Studio” by Eric Flamm

092417_1708_AtlantaMSBI1.png

“Applied DAX with Power BI” Book Available

I’m excited to announce my latest book: “Applied DAX with Power BI: From zero to hero with 15-minute lessons“. This book was born out of necessity. In my consulting practice, I had been teaching and implementing Power BI and Analysis Services Tabular, and people were constantly asking for DAX book recommendations. Indeed, DAX is not an easy topic and has its ways to humble even experienced practitioners. There are a few good reference books out there, but they could be somewhat overwhelming for novice users. So, I turned my classroom and consulting experience into this book and designed it as a self-paced guide to help you learn DAX one lesson at a time.

This one was a different book to write. Unlike my other books, which tend to be heavy, this book is smaller (220 pages). Regardless, it took me a while to write it. The book also adopts a lesson-based approach,  so the reader can learn and practice a specific DAX concept quickly. There are usually three sections in each lesson: Practice (with step-by-step guidance), Outcome (a visual that shows the result), and Analysis (explaining why it works this way). Most lessons are five to six pages long, and it should take no more than 15 minutes to complete the lesson’s exercises (you can download and view a sample lesson from the book page).

And if you do one lesson per day, you’ll be a DAX expert in a month!

Currently, the book is available on Amazon (paperback and Kindle eBook) and it’s making its way to other outlets. As always, I’m awaiting feedback on the discussion forum at the bottom of the book page.

051119_2019_AppliedDAXw1.png

Atlanta MS BI and Power BI Group Meeting on May 6th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on May 6, Mondayat 6:30 PM at the Microsoft office in Alpharetta. Patrick LeBlanc, a Principal Program Manager at Microsoft, will show us how to integrate Power BI and Flow for geospatial analytics. I’ll showcase Power BI Report Builder.  TEKSystems will sponsor the event. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on http://atlantabi.pass.org if you’re planning to attend).

Presentation:

Using the Power Platform to Enhance Spatial Granularity

Date:

May 6, 2019, Monday

Time

6:30 – 8:30 PM ET

Place:

Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009

Overview:

Join this session to see how you can use the Power Platform to solve a real world customer problem. With a growing mobile workforce, many organizations are trying to figure out how to properly arm them with the correct tools. In this session we will explain and demonstrate how you can leverage the Power BI Mobile app and Microsoft Flow to build a solution that enables them to quickly identify locations near and far. We will demonstrate how the solution was built step-by-step, detailing how and why certain technologies and solutions were selected or not.

Speaker:

Patrick LeBlanc is a currently a Data Platform Solutions Architect. Along with his 15+ years’ experience in IT he holds a Master of Science degree from Louisiana State University. He is the author and co-author of five SQL Server books and one of the only two guys in the cube. Prior to joining Microsoft, he was awarded Microsoft MVP award for his contributions to the community. Patrick is a regular speaker at many SQL Server Conferences and Community events.

Sponsor:

People are at the heart of every successful business initiative. At TEKsystems, we understand people. Every year we deploy over 80,000 IT professionals at 6,000 client sites across North America, Europe and Asia. Our deep insights into IT human capital management enable us to help our clients achieve their business goals – while optimizing their IT workforce strategies. We provide IT staffing solutions, IT talent management expertise and IT services to help our clients plan, build and run their critical business initiatives!

Prototypes
with pizza

“Power BI Report Builder” with Teo Lachev

092417_1708_AtlantaMSBI1.png

Power BI XMLA Endpoint

If “XMLA” doesn’t ring a bell especially in the context of Power BI, it stands for Extensible Markup Language for Analysis. Still puzzled? It’s the protocol of Analysis Services (Multidimensional and Tabular). So, when an Excel or Power BI sends a query to a cube, it’s encoded according the XMLA specification (an XML-based format). And, the XMLA endpoint is the web service endpoint that Analysis Services listens for upcoming requests.

Now that I cleared the terminology, Microsoft announced the public preview of the XMLA endpoint in Power BI Premium. Since Power BI uses Analysis Services Tabular to scale and hosts the Power BI Desktop models you deploy, this means you can now access that Analysis Services backend instance which wasn’t accessible before. Or, at least read-only for now, meaning that you can only query it and not deploy organizational Tabular models to it.

What it’s in there for you? Here are some scenarios that this management feature enables:

  • Besides Excel, you can connect any client that support Analysis Services, such as Tableau, to your published Power BI datasets if you’re looking for ways to diversify reporting. Chris Finlan mentions this as one scenario worth considering for the near-and-dear to my heart Report Builder (aka Power BI Paginated Report Builder).
  • You can profile your Power BI published datasets by connecting the SQL Server Profiler.
  • You can stress test your Power BI datasets, such as to ensure that they support the expected workload.
  • And my favorite, I can script any Power BI model to see how all these hidden features, such as grouping and binning, are coded.

040719_0037_PowerBIXMLA2.png

Microsoft is probably finding it increasingly difficult to maintain multiple deployments for Analysis Services: SSAS, AAS, and Power BI. Once the Power XMLA endpoint is writable and Power BI dataset sizes increase, I expect them to nudge customers to deploy organizational Tabular models to Power BI Premium, to be on the latest and greatest on a single platform. I hope that it won’t require Power BI Premium too when it goes GA.

Atlanta MS BI and Power BI Group Meeting on April 1st

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on April 1, Monday, at 6:30 PM at the Microsoft office in Alpharetta. Considering all the the momentum behind machine learning, the main presentation will be “Microsoft Citizen Data Science with Power BI” by Stacey Jones, a Principal Data Solution Architect at Microsoft. I’ll do a brief overview of the brand new Modeling tab in Power BI Desktop.  Vacation Express will sponsor the event. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on http://atlantabi.pass.org if you’re planning to attend).

Presentation:Microsoft Citizen Data Science with Power BI
Date:April 1 2019, Monday
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009

 

Overview:Are you curious about Data Science? Are you a Data Scientist? If you answered yes and no to these questions respectively, this session is for you. In this session I will discuss the emerging Citizen Data Scientist role and demonstrate the emerging tools targeting this role, especially the new features in Power BI, such as AutoML
Speaker:Stacey Jones specializes in mentoring and guiding firms in their efforts to build a modern Data, AI & BI governance programs that empower their business with Self-Service BI and Data Science capabilities. He currently serves as the Principal Data Solutions Architect at the Atlanta Microsoft Technology Center (MTC).
Sponsor:Based in Atlanta, Vacation Express is one of the largest tour operators in the southeast United States and has been in business for 30 years. Vacation Express is a member of family-owned and operated, Toronto-based Sunwing Travel Group Inc., Canada’s leading vertically integrated leisure travel company, and has sold well over 2 million vacation packages to its exotic destinations!
Prototypes
with pizza
“The Power BI Desktop Model tab” by Teo Lachev

092417_1708_AtlantaMSBI1.png

First Look at Power BI AutoML

In my previous “First Look at the New Power BI Integration with Azure ML” blog, I showed you how Power BI will make it very easy to integrate it with Azure Machine Learning and Cognitive Services. Data scientists today have plenty of options to create predictive models but all of them have one thing in common: expertise in data science and statistics.

The promise of AutoML is to bring predictive analytics to data analysts just like self-service BI brought analytics to business users.

Let’s see how AutoML works based on what’s in the private preview (the usual disclaimer is that things will probably change). To start with, AutoML requires a dataflow (a note to Microsoft here is that AutoML will become more pervasive if it’s available in Power BI Desktop and it doesn’t require a premium capacity). In the private preview, AutoML requires the following steps. Presumably. the first (and most difficult step), preparing the dataset and cleansing the data is already done and available as a dataflow entity:

030319_1624_FirstLookat1.png

Choosing model type

In private preview, choosing a model type requires the data analysts to select one of four model types:

  • Binary Prediction – a type of a classification task where the model predicts only two (Boolean) states, such as if a customer is a potential buyer or not. It requires a Boolean outcome field.
  • General Classification – a type of a classification task where the model predicts more states, such as if the customer’s risk falls in Low, Medium, or High risk.
  • Regression – Similar to classification, regression predicts a numeric outcome, such as the customer risk on the scale from 0 to 10.
  • Forecasting – Time series forecasting, such as to predict the revenue for future months.

I’d argue here that if want to make this super easy, we shouldn’t be asking a business user about the type of model. Power BI can simply examine the field that needs to be predicted, which happens in the next step, and suggest the model type so the current choices can be collapsed between Classification and Forecasting.

Selecting training data

Next, you need to select the field that needs to be predicted (outcome field) and input columns. This step is the most important and arguably the most difficult even for data scientists because you need to have a business domain knowledge to guess which input fields are the most significant. Modelers tend to lean on the safer side by selecting all fields which dilutes the model accuracy. Here, it will be nice if AutoML suggests which fields are the most significant. AutoML has a validation report that shows you the key influencers after the model is ready, but it would be better if it has similar feature to aid you in the field selection.

The wizard’s next step depends on the mode type. For binary classification, it asks you to specify the field names for two states, such as Buyer and NonBuyer.

Training the model

The model is ready but requires an entity refresh to train it. At this point, AutoML generates two additional entities for providing inputs for testing and training data, such as 80% of the input dataset is used to train the model (generate the predictive patterns) and 20% is used to test the model accuracy against the historical data. The assumption here is probably let the user to fine tune the training and testing data if they want to have more control. I think though that this is too technical, and it should be handled internally by AutoML to reduce the number of entities and overall complexity.

The actual model exists on the Power BI capacity. All the training happens there, and the resulting patterns are stored there. Data does not leave the capacity and the model is not published as an experiment in Azure ML.

Iterating and retraining

One thing is to prepare the model, which is easy with AutoML, but another thing is to make sure that the model is useful and results in a true gain. This is where things get technical but AutoML seeks to simplify this with a validation report. The report has three tabs:

  • Model Performance – The most important information here is the key influencer charts that show you which input fields are the top predictors. You should review this chart and possibly remove input fields that are insignificant. This page also has the only input change you can make to influence the model prediction, which is to specify the probability threshold. For example, for Binary Prediction you can change the probability threshold to 50% to indicate that the customer is considered a buyer if the predicted score is equal or above 50%.
  • Accuracy Report – The most important information on this page is the so called lift chart. You want to verify here that the solid line is above the diagonal line which represents a random guess, i.e. if we don’t use predicative analytics but just randomly guess which customers will become buyers.
  • Technical Details – This page has additional technical information which will be over the head of data analysts but could be useful to pass them on to the data scientist, such as what parameters were used for the model algorithm.

Applying the model

Now that you have the model trained, you can operationalize it by applying it to your original entity, that was used to train the model, or to another entity, such as a list of new customers, as long as it has the same fields that are used as input to the model. Applying the model to an entity adds yet another (computed) entity which has the original fields, plus the predicted. For example, the Binary Prediction model type adds two new fields:

  • Outcome – TRUE if the model predicted that the customer will be a buyer and FALSE otherwise.
  • PredictionScore – The likelihood of the prediction from 0 to 100.
  • PredictionExplanation – Additional details that include the weights of each input field to explain the predicted outcome.

So, once all is done, you’ll have 7 entities:

  • Original entity
  • AutoML entity – trains the model
  • Training data – entity used for training the model
  • Testing data – entity used for testing the model
  • AutoMLScore – a function used for scoring the input dataset
  • AutoMLPollingQuery – a helper entity which I believe returns the identifier of the model

I’d personally prefer less entities. For example, I’d rather see the predicted fields added to the entity to which the model is applied to, as opposed to having another “enriched” entity.

Power BI AutoML brings predictive analytics to data analysts. Powered by ML.NET, AutoML automates the creating of predictive models to handle common predictive tasks, such as classification and forecasting. I hope Microsoft simplifies the process by reducing the number of entities and refreshes required, and by bringing AutoML to Power BI Desktop.

Atlanta MS BI and Power BI Group Meeting on March 4th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on March 4th, Monday, at 6:30 PM at the Microsoft office in Alpharetta. The main presentation will be “Power BI Architecture” by Paco Gonzalez  from SolidQ. Eric Flamm will do us Prototypes with Pizza on configuring a Power BI visual for mobile layout. SolidQ will sponsor the event. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the group page) if you’re planning to attend.

Presentation:Power BI Architecture
Date:March 4, 2019, Monday
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009

Overview:Having the right architecture is key to the adoption of Power BI, and to make the most of your business intelligence solution. This session guides you through different Power BI architectures, such as Direct Query, Import, Connect Live, Composite Models, and Embedded. We will analyze different techniques in data refresh, incremental extraction, integration and collaboration to provide a quick overview of the different approaches available.
Speaker:SolidQ North America CEO and Microsoft Data Platform MVP. Paco Gonzalez has more than 15 years leading teams for high visibility solutions on Data Platform, Business Intelligence, and Machine Learning. He is also a frequent speaker at large and small conferences. SQL Saturdays, TechEd’s, PASS BA, PASS Summit,  DevWeek London, PAW Chicago and London, or Ignite.
Sponsor:The tools and technologies to manage information will constantly evolve, but the need for accurate and actionable information is timeless. SolidQ delivers services for Microsoft platforms that help you architect, integrate and optimize your use of data. The result? Our clients think bigger and move faster because we help them build the capacity and skills to interact with data in creative, collaborative ways that deliver new insights to the business. At SolidQ we measure our success by your satisfaction – and we guarantee it.
Prototypes with pizza“Move your PowerBI Viz to your Phone” by Eric Flamm

092417_1708_AtlantaMSBI1.png

2019 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms

The 2019 Gartner Quadrant for Analytics and BI Platforms is out and Microsoft BI aficionados like your humble correspondent have all the reasons to rejoice. Say whatever you want about Gartner but the quadrant is a great testimonial for Power BI and I think the highest score was overdue. Microsoft still has tons of work on the visualization side of things but I’m glad that Gartner recognized the business value of Power BI which exceeds by far any other BI tool on the market. A few comments on the cautions in the report:

  • Differences in on-premises and cloud service with Azure cloud only – This is about the discrepancies between Power BI and Power BI Report Server which I discussed last year. Revisiting this, I’d like to know how many customers have complained. In my practice, I don’t see much interest in Power BI Report Server for deploying Power BI reports as everyone wants to be in the cloud.
  • Integration of Mode 1 and Mode 2 – This is the same “caution” as last year which is already covered by the first caution so I’m not sure why it’s reiterated. “With Power BI, Microsoft has mainly focused on requirements for Mode 2 (agile, self-service) analytics. On-premises SQL Server Reporting Services meets the needs of Mode 1”. I think Gartner will be happy once the SSRS integration is fully baked in Power BI Premium.
  • Multiple products – Kind of repeated from last year by phrased somewhat differently “Although the core of Power BI is a self-contained product, Microsoft’s roadmap spans multiple products.” So, the same comment. I don’t see a problem with having multiple products if they can integrate. Especially given that some are free, such as PowerApps and Flow with O365. The reason why they are separate products is that they don’t apply to BI only. And which BI vendor has flow, business apps, predictive models, in the same product? Not to mention that other vendors don’t have anything more than just a reporting tool.

Perhaps, the next most difficult thing after getting to the top is remaining on the top. Will Microsoft sustain the Power BI leader position or gradual attrition and shifting priorities will let smaller and more agile vendors dethrone it?

021419_2128_2019Gartner1.png

Atlanta MS BI and Power BI Group Meeting on February 4th

MS BI fans, join us for the next Atlanta MS BI and Power BI Group meeting on February 4th, Monday at 6:30 PM at the Microsoft office in Alpharetta. The main presentation will be “Azure Data Lake for First Time Swimmers” by Samara Soucy  from Innovative Architects. Julies Smith will do us Prototypes with Pizza on “Python Fuzzy Wuzzy for Master Data Identity Mapping”. Innovative Architects will sponsor the event. For more details, visit our group page and don’t forget to RSVP (use the RSVP survey on the group page) if you’re planning to attend.

Presentation:Azure Data Lake for First Time Swimmers
Date:February 4, 2019, Monday
Time6:30 – 8:30 PM ET
Place:Microsoft Office (Alpharetta)

8000 Avalon Boulevard Suite 900
Alpharetta, GA 30009

Overview:Data Lake may be the hot new way to store data for reporting, but it can be scary if you’ve never dealt with one. If you have no idea what a data lake is, have no fear! You will get an intro to the concept of data lake and what scenarios you should be considering it as your platform. From there you’ll get to see how Azure Data Lake Analytics and U-SQL allow developers and data professionals to create compelling analytics with data that previously was difficult to work with.
Speaker:Samara is a Microsoft Certified Specialist in C#. Working in both web and data, she has experience in a wide variety of technologies including JS, .Net, SQL, Analysis Services, and Power BI. Working with Innovative Architects as a Software Develoment Consultant, she uses that knowledge to plug into whatever part of their system needs additional expertise.
Sponsor:Innovative Architects specializes in solving business problems around business intelligence, data management, e-commerce, portals & collaboration, custom application engineering and business process and integration.  https://innovativearchitects.com/
Prototypes with pizza“Python Fuzzy Wuzzy for Master Data Identity Mapping” by Julie Smith

092417_1708_AtlantaMSBI1.png

T-SQL, Power Query, or DAX?

I taught my Applied Power BI class last week to a group of smart data analysts. All of them were knowledgeable of T-SQL, which they have been using extensively for years to shape and transform the data and produce SSRS, Qlik and now Power BI reports. They haven’t previously used Power Query, but they liked my overview of its features. However, they were rightfully confused when to use T-SQL, Power Query, or DAX. Starting with the data source and moving up the chain, Power BI lets you use expressions in:

  1. Data source (custom SQL Query, SQL view)
  2. Power Query
  3. DAX calculated columns
  4. DAX measures (the uppermost level)

My advice is to invest time and learn all these capabilities. When you have diverse skills, you can use the best tool for the task at hand.

Although is hard to generalize as every data transformation is different, I recommend you shape the data as upstream (closer to the data source) as possible. So, if you are familiar with T-SQL and that’s your primary data source, then use T-SQL. As one of the Microsoft best products, SQL Server has been enjoying 30 years of continuous improvements. Not only can you apply the skills you already have, but you will gain in performance when you use T-SQL and delegate data crunching to SQL Server which is what it’s designed to do. You can also benefit from the rich data manipulation features of T-SQL. These are the same reasons why I favor the ELT (Extract, Load and Transform) pattern in organizational BI solutions instead of SSIS data flow transforms. And if you find T-SQL lacking in features, Power Query can supplement it nicely, such as to fill down missing values, quickly unpivot data, or apply fuzzy lookup.

When you connect to data sources that don’t support SQL, such as flat files or Excel, the next natural place is Power Query to shape and transform the data. The choice between Power Query and DAX calculated columns is a tricky one and typically involves a compromise between performance and skill set. In many cases, you’ll find that a custom column can be implemented both in Power Query and DAX calculated columns. In general, if performance is OK, use Power Query, which is further upstream than DAX calculated columns. As a bonus, your data model will see the custom columns as regular columns and compress them equally well. Consider DAX calculated columns (one level up Power Query) when:

  • You must use DAX features that Power Query lacks, such as ranking.
  • When Power Query transforms lead to long data refresh times, such as a lookup between two large tables.

Finally, while custom columns can be often implemented in any of the first three layers, DAX measures are unique, and they typically can be implemented in DAX only. For example, if you need the expression to reflect the end user filter selection, you must use a DAX measure because only DAX measures are evaluated at run time and can access runtime conditions, such as values from filters and slicers.