Prologika Newsletter Summer 2017

What Does Power BI Premium Mean for You?


060417_1725_PrologikaNe1.pngAs I mentioned in my Spring newsletter “15 Reasons Why Microsoft is Leading the BI Space”, Microsoft BI is enjoying resurgent interest from customers and industry observers, and Power BI gets lots of attention. I’m sure that you’ve also heard the announcements about Power BI Premium – a new dedicated licensing model for Power BI. In fact, Power BI Premium is so important to Microsoft that it’s positioned as a new product under the Power BI umbrella name instead of a new licensing model. In this newsletter, I’ll discuss what Power BI means for you and why you should strongly consider it. If you live in these cities, consider joining me at SQL Saturday Chattanooga on June 24th or the Atlanta MS BI Group on June 26th when I’ll present “Understanding Power BI Premium” and discuss its inner workings in more detail. My June calendar has the details about these events.


Why Power BI Premium?

According to Microsoft, for less than 2 years since Power BI became generally available, more than 5 million users are currently using it. As the product matures, the following growing pain points have emerged:

  • Per-user licensing model – Large organizations find it difficult to license Power BI per user. A case in point – one year after a successful Power BI hybrid pilot, a Fortune 100 organization has purchased whopping 5 Power BI Pro licenses. There are several reasons for the slow adoption by large companies but one of them is the per-user license.
  • Cost prohibitive with large user base – The per-user licensing model doesn’t include a “reader” license. If a report has a Power BI Pro features, all users accessing reports would need Power BI Pro licenses. So, if you a report that used Power BI Pro features, such as gateways or live connections, and you won’t this report to be available to 1,000 users, you had to foot $10,000/month bill because everyone required Power BI Pro.
  • Performance – Power BI Service is a shared cloud environment. Although no one has shared performance issues with me, it’s possible that the performance of your reports might be negatively affected by other organizations.
  • No “premium” features – When I teach Power BI I often get feedback that some missing features, such as caching and incremental refresh, are important. This goes back to performance but the reason why they are not currently available is because they might impact performance in a shared environment.
  • Confusing licensing – Previously, there was some overlap between Power BI Free and Power BI Pro. Many organizations find it very difficult to navigate what features are available where.

What’s There to Like in Power BI Premium?

There are several reasons why Power BI might be appealing to your organization.

To start with, larger organizations can save on licensing cost. If all users would need access to reports with Pro features, the breakpoint at which Power BI Premium reduces cost is above 500 users. That’s because, Power BI Premium allows Power BI Free users to access Pro content without Power BI Pro licenses. You can use the Power BI Premium calculator to experiment with what-if scenarios and calculate your licensing cost.

Cost savings is not the only reason to consider Power BI Premium. You also get dedicated capacity. When you purchase Power BI Premium, you buy a dedicated environment for your organization. Think of it as buying a dedicated IaaS service, such as an Azure VM, versus a PaaS service, such as an Azure SQL Database, that is shared with other users on the same server. For more information about how this works and what Premium plans are available, read the Power BI Premium whitepaper.

Almost organization I’ve consulted with has the need to embed reports for a third party. Previously, if you want to embed Power BI reports, you had to use the Azure Power BI Embedded Service, which was separate from the Power BI Service. Consequently, Power BI Embedded had to catch up with Power BI Service. For example, it still doesn’t have connectivity to on-premises data sources. The good news is that Power BI Embedded marries Power BI Service (Power BI Premium = Power BI Service + Power BI Embedded) so there will be feature parity. If you are concerned about the increased licensing cost (Power BI Embedded has an hourly-based session licensing model), Microsoft unveiled low-cost Power BI Premium plans targeted to report embedding only and starting at $625/month.

powerbipremiumplans

Another appealing detail concerning embedded analytics is that unlike Power BI Premium which requires reports authors to have Power BI Pro licenses, creating new embedded reports or changing existing embedded reports by external users don’t require a Power BI Pro license. Of course, if your organization purchases a Power BI Premium plan, such as P1 starting at $5,000/month, not only will you be able to use Power BI Premium for internal use, but you can also embed reports for external customers. In other words, one Power BI Premium license covers both internal and external reporting.

Finally, Power BI Premium-only features are in the works, such as real-time caching by pinning datasets to memory, incremental dataset refresh, secondary model copies for disaster recovery, and geo replicas to reduce latency by replicating instances around the world.

Changes to the Power BI Portfolio

You might wonder how Power BI Premium impacts your current Power BI usage. The table below shows how Power BI Premium changes the Power BI portfolio.

TodayFuture
Power BI DesktopConnect to 70+ data sources

Data transformations

Report creation and exploration

No changes
Power BI FreeNo live connections, No gateway connectivity

Smaller capacity limits and data refresh rates

Only simple dashboard sharing

Power BI Pro feature parity
No sharing (not even simple dashboard sharing)
Power BI ProAccess to all data sources

Larger capacity limits and data refresh rates

All sharing options (simple, workspaces, org content packs)

No changes
Power BI PremiumIncreased capacity limits
Dedicated environment
Content distribution (reader license)
Power BI Report Server
More features in future (read the whitepaper)

So, don’t panic! You can continue using Power BI under the current per-user licensing model (Power BI Free and Power BI Pro) if Power BI Premium makes no sense to you. Another good news is that Power BI Free now has all the Power BI Pro features, so hopefully we won’t see “This report requires Power BI Pro license” anymore. The bad news is that with per-user licensing, Power BI Free is for individual use only. A Power BI Free user can’t share any Power BI content (not even simple dashboard sharing). And, unless you purchase Power BI Premium, content can’t be shared with Power BI Free users (Both inbound and outbound sharing are not available for Power BI Free users).

Power BI Report Server

Delivering on its on-premises reporting roadmap, Microsoft has extended SQL Server Reporting Services (SSRS) to support Power BI reports. The extended SSRS will be known as Power BI Report Server. Moreover, Microsoft has decoupled SSRS from SQL Server so it gets more frequent updates. Moving forward, SSRS becomes two products:

  • SSRS – This is the SSRS we know it but with no Power BI integration. It will get new RDL features but no Power BI integration features.
  • Power BI Report Server – It gets both RDL and Power BI features. As far as the reason for the name change, the Power BI name is a strong brand while SSRS has been associated with the old style paginated reports.

You can get Power BI Report Server in two ways:

  • As a part of the Power BI Premium plan where you get the same number of licensed on-premises cores as the v-cores you purchased within your Power BI Premium plan. You can use these licensed cores only for Power BI Premium Report Server (SQL Server requires a separate license if you install it on the same box as Power BI Premium Report Server).
  • Standalone and covered by a SQL Server Enterprise Edition with Software Assurance license. Currently, SQL Server doesn’t check for Software Assurance in any way (there isn’t such SKU). So, Power BI Premium licensing would be an honor system for customers who want to get it standalone (with SQL Server Enterprise Edition licenses).

So, although Power BI Premium Report Server has divorced SQL Server, you can still cover by the SQL Server license (kind of when you send your kid to college but she still lives with you). In both cases, developers publishing Power BI reports to the Power BI Report Server will require a Power BI Pro license at $9.99/user/mo.

MS BI Events in Atlanta

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service. Regards,

Teo Lachev

Teo Lachev President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Spring 2017

15 Reasons Why Microsoft is Leading the BI Space


I hope you’re enjoying the beginning of 2017 and you’re staying healthy and busy. It’s been a hectic and exciting period for me. I published the second edition of my “Applied Microsoft Power BI” book in January to keep it up to date.  It’s great to see all the momentum and interest surrounding Power BI. Just last month, I taught my “Applied Power BI” class three few weeks in a row! On the implementation side of things, companies of all sizes are adopting Power BI. And, one of our customers made the Power BI blog with their CRM platform that uses Power BI Embedded to deliver interactive reports to their users!


The Gartner 2017 Magic Quadrant for BI and Data Analytics

Speaking of Power BA and its momentum, industry observers are taking notice. As Kamal Hathi (General Manager, Microsoft BI) announced, the newly released Garner Magic Quadrant for Business Intelligence and Data Analytics gave Microsoft a very high score. Read the full report here.

Of course, there are many competing definitions of what constitutes a BI and Analytics platform. Note that Gartner reviewed only Power BI and certain aspects of Cortana Analytics. Reporting Services, Analysis Services and the other on-premises and cloud Microsoft BI-related solutions are not covered, such as Azure SQL Database, SQL Data Warehouse, Azure ML, Query Catalog, Cognitive Services, HDInsight,  Data Lake, StreamInsight). If we take them in consideration, where will the dot be?

The image below shows the Microsoft’s lift between last year and this year in the Gartner magic quadrant. What a leap!

 031117_2001_PrologikaNe2.png 031117_2001_PrologikaNe3.png

I’m surprised to find that the dot is not higher on the ability to execute, especially considering that Gartner ranked Microsoft and its main rival (Tableau) almost the same. You might also find my two-part blog about Tableau vs. Microsoft useful if you are tasked to compare these two vendors.

Understanding the 15 Critical Capabilities

As explained in the report, to evaluate vendors, Gartner uses 15 critical capabilities to support five main use cases: Agile Centralize BI Provisioning, Decentralized Analytics. Governed Data Discovery, Embedded BI, and Extranet Deployments. The following table includes my comments on each of the 15 capability.

CapabilityMy Rank Teo’s Comments
InfrastructureBI Platform Administration, Security and Architecture
Capabilities that enable platform security, administering users, auditing platform access and utilization, optimizing performance and ensuring high availability and disaster recovery.

Medium to High

There are a lot of things here:
Security – More work is required to support external users in Power BI, Power BI Embedded, and SSRS
Administering users – Power BI supports a progressive deployment depending on the desired level of integration with Office 365 (unmanaged tenant, managing tenant, and federated AD)
Auditing: Power BI supports auditing access and utilization but more work is required for proactive monitoring and improving its data governance capabilities
Performance and HA: Highly-redundant and scalable backend, consisting of SSAS Tabular farm and Azure services.
Cloud BI
Platform-as-a-service and analytic-application-as-a-service capabilities for building, deploying and managing analytics and analytic applications in the cloud, based on data both in the cloud and on-premises

High

Power BI supports both pure cloud and hybrid architectures depending on data volumes and security requirements. No other vendor offers both architectures.
Data Source Connectivity and Ingestion
Capabilities that allow users to connect to the structured and unstructured data contained within various types of storage platforms, both on-premises and in the cloud.

High

As of this time, Power BI supports close to 70 connectors to let you connect to cloud and on-premises data sources. No scripting required.
Data ManagementMetadata Management
Tools for enabling users to leverage a common SOR semantic model and metadata. These should provide a robust and centralized way for administrators to search, capture, store, reuse and publish metadata objects such as dimensions, hierarchies, measures, performance metrics/key performance indicators (KPIs), and report layout objects, parameters and so on. Administrators should have the ability to promote a business-user-defined data mashup and metadata to the SOR metadata.

Medium to High

Microsoft offers an organizational BI Semantic Model (BISM) which is typically implemented as an Analysis Services Tabular or Multidimensional models. On the self-service side of things, users can create Excel or Power BI Desktop models. IT can promote a self-service Excel model to a Tabular model. Although not officially supported because Power BI Desktop is changing every month, IT can upgrade a Power BI Desktop model to a Tabular model.
Self-Contained Extraction, Transformation and Loading (ETL) and Data Storage
Platform capabilities for accessing, integrating, transforming and loading data into a self-contained storage layer, with the ability to index data and manage data loads and refresh scheduling.

Medium

SSIS is the most popular on-premises ETL tool. More work is required to bring similar capabilities in the cloud (I think Azure Data Factory is a step backwards)
Self-Service Data Preparation
The drag-and-drop, user-driven data combination of different sources, and the creation of analytic models such as user-defined measures, sets, groups and hierarchies. Advanced capabilities include semantic autodiscovery, intelligent joins, intelligent profiling, hierarchy generation, data lineage and data blending on varied data sources, including multistructured data

High

Power BI Desktop and Excel has a fantastic query editor (originated from Power Query) that scores big with business users. Other vendors that score high in the Magic Quadrant don’t have such native capabilities. Power BI and Excel have best of class self-modeling capabilities (much better than Tableau). Azure Query Catalog can be used for dataset autodiscovery.
Analysis and Content CreationEmbedded Advanced Analytics
Enables users to easily access advanced analytics capabilities that are self-contained within the platform itself or available through the import and integration of externally developed models.

High

Not sure what is meant here by “advanced analytics capabilities”. Power BI supports Quick Insights (brute-force machine learning), integration with R, Azure Machine Learning, clustering, forecasting, binning, but I might be missing something.
Analytic Dashboards
The ability to create highly interactive dashboards and content, with visual exploration and embedded advanced and geospatial analytics, to be consumed by others.

High

“Highly interactive dashboards and content” is what Power BI is all about.
Interactive Visual Exploration
Enables the exploration of data via an array of visualization options that go beyond those of basic pie, bar and line charts to include heat and tree maps, geographic maps, scatter plots and other special-purpose visuals. These tools enable users to analyze and manipulate the data by interacting directly with a visual representation of it to display as percentages, bins and groups.

Medium

Power BI covers all stated requirements in this category. However, while its visualization capabilities are improving every week, more work is required to catch up with the Tableau and Qlik in some areas, such as cell drillthrough, ability to repeat visualizations, such as to nest a bar into a table, conditional binding, exression-based properties and filters, and others.
Smart Data Discovery
Automatically finds, visualizes and narrates important findings such as correlations, exceptions, clusters, links and predictions in data that are relevant to users without requiring them to build models or write algorithms. Users explore data via visualizations, natural-language-generated narration, search and NLQ technologies.

Medium to High

Through Quick Insights, Power BI can automatically find correlations, outliers, and other hidden trends. It doesn’t currently have native narrative capabilities, except for the custom visual from Narrative Science. Power BI is one of the few vendors that offers natural queries.
Mobile Exploration and Authoring
Enables organizations to develop and deliver content to mobile devices in a publishing and/or interactive mode, and takes advantage of mobile devices’ native capabilities, such as touchscreen, camera, location awareness and natural-language query

High

Native apps for iOS, Android and Windows to surface both Power BI and SSRS reports.
Sharing of FindingsEmbedding Analytic Content
Capabilities including a software developer’s kit with APIs and support for open standards for creating and modifying analytic content, visualizations and applications, embedding them into a business process, and/or an application or portal. These capabilities can reside outside the application (reusing the analytic infrastructure), but must be easily and seamlessly accessible from inside the application without forcing users to switch between systems. The capabilities for integrating BI and analytics with the application architecture will enable users to choose where in the business process the analytics should be embedded.

High

An Azure cloud service, Power BI Embedded allows you to embed reports for external users with very appealing cost-effective licensing model. On premises, SSRS offers similar capabilities.
Publish, Share and Collaborate on Analytic Content
Capabilities that allow users to publish, deploy and operationalize analytic content through various output types and distribution methods, with support for content search, scheduling and alerts. Enables users to share, discuss and track information, analysis, analytic content and decisions via discussion threads, chat and annotations.

High

Power BI supports subscriptions and data alerts but we can do better, such as to allow an admin to subscribe other users. Power BI workspaces and groups allow users to share BI artifacts, participate in discussion lists, and share calendars. Power BI Mobile allows users to annotate reports.
Overall platform capabilities Platform Capabilities and Workflow
This capability considers the degree to which capabilities are offered in a single, seamless product or across multiple products with little integration.

Medium to High

The integration hub is Power BI Service which integrates Power BI reports, Excel reports, SSRS reports. More work is required on integration with Excel reports that connect to SSAS models.
Ease of Use and Visual Appeal
Ease of use to administer and deploy the platform, create content, consume and interact with content, as well as the visual appeal.

High

As far as deployment, Power BI is “five seconds to sign up, five minutes to wow!” As far as the visual appeal, beauty is the eye of the beholder but Power BI has done respectable job to follow best visualization practices.

MS BI Events in Atlanta

  • Atlanta MS BI and Power BI Group: “Comparing Power BI to Tableau” presentation by Slalom Consulting on 3/27
  • Atlanta MS BI and Power BI Group: “Using Power BI with Visual Studio Team Services” on 4/24.
As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

sig-1

Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Winter 2016

Designing an Operational Data Store (ODS)


odsI hope you’re enjoying the holiday season. I wish you all the best in 2017! The subject of this newsletter came from a Planning and Strategy assessment for a large organization. Before I get to it and speaking of planning, don’t forget to use your Microsoft planning days as they will expire at the end of your fiscal year. This is free money that Microsoft gives you to engage Microsoft Gold partners, such as Prologika, to help you plan your SQL Server and BI initiatives. Learn how the process works here.


Just like a data warehouse, Operational Data Store (ODS) can mean different things for different people. Do you remember the time when ODS and DW were conflicting methodologies and each one claimed to be superior than the other? Since then the scholars buried the hatchet and reached a consensus that you need both. I agree.

To me, ODS is nothing more than a staging database on steroids that sits between the source systems and DW in the BI architectural stack.

What’s Operational Data Store?

According to Wikipedia “an operational data store (or “ODS”) is a database designed to integrate data from multiple sources for additional operations on the data…The general purpose of an ODS is to integrate data from disparate source systems in a single structure, using data integration technologies like data virtualization, data federation, or extract, transform, and load. This will allow operational access to the data for operational reporting, master data or reference data management. An ODS is not a replacement or substitute for a data warehouse but in turn could become a source.”

OK, this is a good starting point. See also the “Operational Data Source (ODS) Defined” blog by James Serra. But how do you design an ODS? In general, I’ve seen two implementation patterns but the design approach you take would really depends on how you plan to use the data in the ODS and what downstream systems would need that data.

One to One Pull

ODS is typically implemented as 1:1 data pull from the source systems, where ETL stages all source tables required for operational reporting and downstream systems, such loading the data warehouse. ETL typically runs daily but it could run more often to meet low-latency reporting needs.  The ETL process is typically just Extract and Load (it doesn’t do any transformations), except for keeping a history of changes (more on this in a moment). This results in a highly normalized schema that’s the same is the original source schema. Then when data is loaded in DW, it’s denormalized to conform to the star schema. Let’s summarize the pros and cons of the One:one Data Pull design pattern.

 ProsCons
Table schemaHighly normalized and identical to the source systemThe number of tables increase
Operational reportingUsers can query the source data as it’s stored in the original source. This offloads reporting from the source systemsNo consolidated reporting if multiple source systems process same information, e.g. multiple systems to process claims
Changes to source schemaSource schema is preservedAdditional ETL is required to transform to star schema
ETLExtraction and load from source systems (no transformations)As source systems change, ETL needs to change

Common Format

This design is preferred when the same business data is sourced from multiple source systems, such as when the source systems might change or be replaced over time. For example, an insurance company might have several systems to process claims. Instead of ending up with three sets of tables (one for each source system), the ODS schema is standardized and the feeds from the source systems are loaded into a shared table. For example, a common Claim table stores claim “feeds” from the three systems. As long as the source endpoint (table, view, or stored procedure) returns the data according to an agreed “contract” for the feed, ODS is abstracted from source system changes. This design is much less normalized. In fact, for the most part it should mimic the DW schema so that DW tables can piggy back on the ODS tables with no or minimum ETL.

 ProsCons
Table schemaDenormalized and suitable for reportingThe original schema is lost
Operational reportingRelevant information is consolidated and stored in one tableSchema is denormalized and reports might not reflect how the data is stored in the source systems
Schema changes to source systemsAs long as the source endpoints adhere to the contract, ODS is abstracted from schema changesA design contract needs to be prepared and sources systems need to provide the data in the agreed format
ETLLess, or even no ETL to transform data from ODS to DWETL needs to follow the contract specification so upfront design effort is required

Further Recommendations

Despite which design pattern you choose, here are some additional recommendations to take the most of your ODS:

  • Store data at its most atomic level – No aggregations and summaries. Your DW would need the data at its lowest level anyway.
  • Keep all the historical data or as long as required by your retention policy – This is great for auditing and allows you to reload the DW from ODS since it’s unlikely that source systems will keep historical data.
  • Apply minimum to no ETL transformations in ODS – You would want the staged data to keep the same parity with the source data so that you can apply data quality and auditing checks.
  • Avoid business calculations in ODS – Business calculations, such as YTD, QTD, variances, etc., have no place in ODS. They should be defined in the semantic layer, e.g. Analysis Services model. If you attempt to do so in ODS, it will surely impact performance, forcing to you to pre-aggregate data. The only permissible type of reporting in ODS is operational reporting, such as to produce the same reports as the original systems (without giving users access to the source) or to validate that the DW results match the source systems.
  • Maintain column changes to most columns – I kept the best for last. Treat most columns as Type 2 so that you now when a given row was changed in the source. This is great for auditing.

Here is a hypothetical Policy table that keeps Type 2 changes. In this example, the policy rate has changed on 5/8/2010. If you follow this design, you don’t have to maintain Type 2 in your DW (if you follow the Common Format pattern) and you don’t have to pick which columns are Type 2 (all of them are). It might be extreme but it’s good for auditing. Tip: use SQL Server 2016 temporal tables to simplify Type 2 date tracking.

RowStartDateRowEndDateSourceIDRowIsCurrentRowIsDeletedETLExecutionIDPolicyKeyPolicyIDPremiumRate
5/2/20105/8/20101000BB76521-AA63-…14969610.45
5/9/201012/31/9999110CD348258-42ED-..24969610.50

MS BI Events in Atlanta

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

sig-1

Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Fall 2016

Embedding Reports in Custom Applications


embedreportLast week’s seminar on formulating a Power BI enterprise strategy held at the Microsoft office was a great success. Over 50 people witnessed the amazing capabilities of the Power BI platform. As Power BI evolves, we’ll have similar events to bring you up to date. If you couldn’t attend, you can find the slides here. Or, are you looking for more in-depth Power BI training? There is still time to register for my 2-day public Power BI workshop on Sep 14-15 at the Microsoft Office in Atlanta. Reserve your seat today to attend this exclusive event for only $999 and learn practical Power BI knowledge and data analytics skills that you can immediately apply to your job.


Speaking of Power BI, do you need to embed reports in custom apps in order to bring data analytics to your customers? If so, look no further than Power BI Embedded, which Prologika has been using to help ISVs increase the value of their apps by bringing instant insights to their customers. Most applications need some reporting capabilities. Report-enabling custom applications has been traditionally challenging with the Microsoft BI platform. True, Visual Studio includes Windows Forms and ASP.NET ReportViewer controls that make it very easy to embed Reporting Services reports (also known as paginated reports). However, the chances are that you might prefer more interactive reports that can be viewed with any browser and on any device. This is where Power BI Embedded comes in. You have the app. You have the data. Now bring data to life inside your app with Power BI Embedded!

Power BI Embedded

What’s Power BI Embedded?

Power BI Embedded is an Azure cloud service that help developers embed Power BI interactive reports in custom apps for a third party. Notice that I said “third party”. The current licensing model prevents you for using Power BI Embedded to distribute reports inside your organization or re-implement functionality that already exists in Power BI. Fair enough – internal users are already covered by Power BI licenses and Microsoft doesn’t want you to come up with a tool that competes with Power BI. Speaking about licensing, what’s great about the Power BI Embedded pricing model is the recent change that Microsoft made where you’re charged per a report session and not for the number of reports that are rendered or registered customers! When the user opens a report, a session is started for one hour or until the user closes the app. The first 100 sessions/mo are on Microsoft. After that you’re charged $5 for 100 sessions per month. Besides the Power BI Embedded great features, this cost-effective pricing model is why independent software vendors (ISVs) and developers are flocking to Power BI Embedded.

How are users provisioned?

If you recall, Power BI requires each user to sign up. Power BI Pro charges the user $9.99 per month (based on feedback from customers, they don’t pay the sticker price because they either acquire Power BI through the Office 365 E5 plan and/or have a discounted price). If you have an app that potentially can be accessed by thousands of users, the Power BI pricing model is not cost effective. By contrast, Power BI Embedded leaves it up to the custom app to authenticate the user. So there is no provisioning you have do. However, although Power BI Embedded uses the Power BI infrastructure, don’t expect the Power BI reports to show up when you log in to Power BI. In other words, Power BI Embedded and Power BI don’t share datasets and reports. Consequently, at the least for now, Power BI Embedded is limited to embedding reports only. No Q&A, no quick insights, no portal, and no other Power BI goodies. Another limitation is that at least for now Power BI Embedded is limited to report viewing only. Users can’t edit the reports, such as to add or remove fields.

What data sources are supported?

One what to acquire data is to import your data in Power BI Desktop. When you import data, you’re limited of 1 GB of compressed data, which is the same dataset limitation that Power BI currently has. Because of the excellent compression (x5-10 ration), you can still pack a lot of data into a 1 GB dataset. For example, you can create a separate extract for each customer if you have a limited number of customers or you need to provide a high degree of customization for each customer. Another option is to connect live to cloud data sources. Because Power BI Embedded is an Azure cloud service, naturally it works with Azure-resident PaaS data sources, such as Azure SQL Database and Azure SQL Data Warehouse.

How does security work?

Power BI Embedded uses OAuth to authorize your users with Power BI. Your application authenticates the user as it would normally do. If the user has access to reports, your application would login to the Power BI Embedded using a special access key (think of it as a password). When the user requests the report, the application generates a token that Power BI validates to grant access to the report. If you need to support row-level security (RLS), Power BI Embedded get you covered too! You can define your security roles and row filters in Power BI Desktop. Then, your application can pass the user login (it doesn’t have to be on your domain) and what role(s) the user belongs to. The net result is that the user can see only the data the user is authorized to see.

How customizable is Power BI Embedded?

The Power BI Embedded customization story got even better with the introduction of the JavaScript APIs. They allow you to programmatically access the report object model on the client, such as to react to page change events, pass filters, or hide the report filter pane. For example, in the screenshot above, the application has its own page navigation and filter pane that replaces the Microsoft-provided filter pane. This gives the application more control about validating and passing visual, page, or report-level filters.

How do I get started?

Microsoft has provided an excellent ASP MVC sample to get you started. Check also my “Configuring Power BI Embedded” blog for a better configuration experience

MS BI Events in Atlanta

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

sig-1

Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Summer 2016

Why Choose Power BI as BI Platform


061916_1807_PrologikaNe2.pngAre you deciding which BI platform to adopt? With so many vendors and products, you are not alone and the choice is far from easy. For the past few months, I’ve consulted with and mentored several large publicly traded companies to help them understand the benefits of Power BI and teach their staff on how to get the most out of it. Speaking of Power BI and large organizations, Prologika added a new Power BI case study to the Power BI partner showcase that demonstrates why the world’s largest logistic company decided to standardize on a single platform and adopt Power BI. Last but not least, if you are in Atlanta, visit our Atlanta MS BI group which is now the only official local Power BI group. So, if you’re serious about Power BI, check our monthly meetings which now feature plenty of Power BI content.


In this newsletter, I’ll enumerate the most important advantages of Power BI that make it stand out from the rest of the competing platforms. Also, I’ll mention some areas my customers have identified where Power BI has left a room for improvement.

Overall

In this section, I’ll review some general implementation and cost considerations that in my opinion make Power BI a very compelling choice.

  1. Data Platform – No matter how good it is, a self-service visualization tool addresses only a small subset of data analytics needs. By contrast, Power BI is a part of the Microsoft Data Platform that allows you to implement versatile solutions and use Power BI as a presentation layer. Want to implement a real-time dashboard from data streams? Azure Stream Analytics and IoT integrates Power BI. What to show reports on the desktop from natural questions? Cortana lets you do it by typing questions or voice. Want to implement smart reports with predicted results? Power BI can integrate with Azure Machine Learning? Want to publish SSRS and Excel reports? Power BI supports this. Expect this strength to increase as Cortana Analytics Suite and prepackaged solutions evolve.
  2. Cloud First – I know that many of you might disagree here as on-premises data analytics is currently more common, but I see the cloud nature of Power BI as an advantage because allows Microsoft to push out new features much faster than the typical 2-year major release cadence of other vendors. Recall that Power BI Service is updated on a weekly basis while Power BI Desktop is on a monthly release cadence. And because Power BI is a cloud service, it supports the versatile integration scenarios I mentioned before.
  3. Cost – It’s hard to compete with a freemium pricing model. Implementing a BI solution has been traditionally cost prohibitive. However, now Power BI Desktop is free, Power BI Mobile apps are free, Power BI Service is mostly free. If you need the Power BI Pro features, Power BI is packaged with the Office 365 E5 plan, it has an enterprise license, and I’ve heard customers get further discounts from Microsoft.

Next, I’ll review specific Power BI strengths for different user types.

Business Users

By “business users”, I’ll mean information workers that don’t have the necessary skills or desire to create data models.

  1. Content packs and Get Data – Basic data analytics needs can be met in Power BI without modeling. For example, if the user is interested in analyzing data from Salesforce, the user can use the Salesforce content pack and get predefined reports and dashboards. Further, the user can create their own reports from the dataset included in the content pack. What if your cloud data sources have a lot of data and content packs are impractical? Microsoft is rolling out pre-configured scalable solutions (currently, Salesforce and Dynamics CRM).
  2. Productivity features – Power BI has several features that resonate very well with business users. Q&A allows users to ask natural questions, such as “sales last year by country”. Power BI interprets the question and shows the most suitable visualization which the user can change if needed. Within 20 seconds, Quick Insights applies machine learning algorithms that help business users perform root cause analysis and to get insights that aren’t easily discernible by slicing and dicing, such as to find why profit is significantly lower in December. Such productivity features are missing in competing products.

Data Analysts

Data analysts (power users) are the primary audience for self-service BI. Power BI excels in the following areas:

  1. Data shaping and transformations – Source data is rarely clean. Excel Power Query and Power BI Desktop queries allow the data analysts to perform a variety of basic and advanced data transformations. For example, these features could help the data analyst shape the data without staging it first to a relational database.
  2. Sophisticated data models – Power BI offers much more advanced modeling experience where a data analyst can build a self-service model on a par with semantic models implemented by BI pros. For example, the model can have multiple fact tables and conformed dimensions. Power BI supports one-to-many and many-to-many relationships.
  3. Powerful programming language – The Data Analysis Expressions (DAX) excels any other vendor’s programming language.

BI and IT Pros

Besides the ability to integrate Power BI to implement synergistic solutions, pros can build hybrid solutions:

  1. Hybrid solutions – Want to get the best of both worlds: always on the latest visuals while leaving data on premises? Power BI lets you connect to your data on premises.
  2. Semantic layer – Many organizations are considering a semantic layer to achieve a single-version of the truth. If your staff is experienced in Power BI modeling, you’ll find that they already have 80-90% of the knowledge they need to implement a Microsoft-based semantic layer with Analysis Services Tabular. This gives you a nice continuum from self-service to organizational BI. For more information about why a semantic layer is important, read my newsletter “Why Semantic Layer?”.

Developers

Developers has much to gain from the Power BI open extensible architecture.

  1. Custom visuals – Power BI allow developers implement custom visuals which can be optionally shared to Power BI Visuals Gallery.
  2. Extensibility – Power BI has a comprehensive REST API that allow developers to integrate Power BI with custom apps. For example, Power BI let developers push data into datasets for real-time dashboards and manipulate deployed objects programatically. Power BI Embedded, currently in preview, allows developers to embed interactive reports without requiring installation of tools and with very attractive licensing model.

Improvement Areas

Here are some areas that customers have identified where Power BI needs improvement:

  1. Direct Query – Currently, Direct Query is limited to a single data source. Microsoft should extend Direct Query to support multiple data sources within a single model.
  2. File size limits – Power BI Service (powerbi.com) is currently limited to 1 GB maximum file size. Some customers have indicated that their data analysts require larger file extracts. My personal advise has been that such large extracts should be avoided in favor of a centralized semantic layer.
  3. DAX complexity – Customers have expressed concerns about the DAX complexity that the lack of quick calculations. Microsoft has already started addressing this by adding the Percent of Total quick calculations. Expect other calculations to light up shortly.
  4. Variables – One large organization transitioning from Qlik/Tableau has pointed out that Power BI lacks variables and parameters, such as to dynamically bind several visualizations to a measure that is chosen as a parameter.
  5. Exporting reports and datasets from Power BI Service to Power BI Desktop – This is currently not supported but high on the Power BI roadmap.
  6. Predictive capabilities – Currently, besides using R or integrating with Azure Machine Learning, Power BI doesn’t have native forecasting capabilities, such as to forecast future months. This is a frequently requested feature and very high on the roadmap.
  7. Drillthrough limitations – Different drillthrough options in Power BI has different limitations. For example, exporting to CSV is limited to 10,000 rows, while drilling through a chart data point is limited to 1,000 rows. This limits will probably lifted in Power BI Pro. Meanwhile, you can use the Analyze in Excel feature and drillthrough in Excel pivot reports which gives you an option to drill through 1,048,576 rows.
  8. Data reduction algorithms – Currently, Power BI visualizations employ data reduction algorithms to limit the number of data points to plot. This is high on the roadmap and there is a work underway to address this limitation.
  9. Replacement for paginated reports – Some organizations have hoped that Power BI can be a replacement of other vendors’ products for paginated (pixel-perfect) reports. Power BI reports are designed for quick data exploration and not as paginated reports. However, SSRS is the Microsoft product for paginated reports. Moreover, SSRS 2016 has been extended significantly to fulfill a very important role in the Microsoft on-premises reporting roadmap.
  10. Maturity – I often hear that Power BI is great but it’s not mature. In my opinion, you should view Power BI to be as mature (or even more mature) as other tools. That’s because the Power BI building blocks have been around for many years, including xVelocity (the in-memory data engine where imported data is stored), Power Query, Power Pivot, Power View, Tabular, and Azure cloud infrastructure.

MS BI Events in Atlanta

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

sig-1
Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Spring 2016

What’s New in SQL Server 2016 for BI?


031316_1550_PrologikaNe2.pngOn a personal note, I’m excited to announce the launch of the new Prologika website (https://prologika.com), which adds a slew of new features to connect better with customers and readers, including site-wide search, responsive web design, case studies, book and blog discussion lists, and more to come. Although the old blog feed should still work, please update it to https://www.prologika.com/feed/. Continuing on the list of announcements, Microsoft added a Prologika Power BI case study to the Power BI partner showcase. Speaking of Power BI, I definitely see a lot of interest from customers in Power BI-based solutions, ranging from self-service BI to white-labeling and report embedding. Last but not list, our Atlanta MS BI group is an official Power BI group! So, if you’re interested in Power BI, check our monthly meetings which now feature more Power BI content.


Spring is here and it brings again a new version of SQL Server. Microsoft launched SQL Server 2016 on March 10th. Its product page include nice videos covering some of the new features. The great news is that the “box” has seen a renewed interest and Microsoft has made significant investments in all the bundled services to help you implement cost-effective and modern data analytics solutions on premises. In this newsletter, I’ll enumerate my favorite BI new features in SQL Server 2016. Feel free to also check my slides on this topic on my LinkedIn profile page.

Tools

The days of waiting years for the next SQL Server release are coming to an end, as you first witness with the client tools.

  • SSMS – You no longer have to run the SQL Server setup just to get SQL Server Management Studio (SSMS). SSMS is now available as a free and standalone download here. Moreover, it will be updated on a monthly basis and it will be backward compatible for all SQL Server supported versions!
  • SSDT – Also, to everybody’s delight, the BI add-on to SQL Server Data Tools (SSDT) is gone. Instead, you just download and install SQL Server Data Tools, which includes the BI projects. No more installing three setup packages to get to the BI stuff. To make your joy complete, SSDT is backward compatible. Actually, SSRS and SSAS have been backward compatible for a while, but now SSIS joins the list so that you can use SSDT to work with legacy SSIS packages.

Database Engine

There are many new features in the Database Engine but the following will be of particular interest to BI practitioners:

  • Updatable columnstore indexes – They will allow you to speed up aggregated queries without having to drop and recreate the columnstore index.
  • Live query statistics – How many times you had to troubleshoot the performance of massive query with many joins? Live query statistics will now show you which joins slows the query down.
  • Temporal tables – Anyone who’s implemented ODS knows that maintaining Type 2 changes is no fun. Temporal tables can maintain changes on any column for you. This feature is also great if you need data change auditing.
  • Integration with R – Leveraging the Revolution Analytics acquisition, the R Server allows your data analysts to run R scripts on top of the SQL Server data. Moreover, DBAs can configure resource limits so that these scripts don’t impact the database performance.

SQL Server Integration Services (SSIS) and Master Data Services (MDS)

I’m somewhat disappointed that the Power Query integration and Lineage Statistics didn’t make the cut. Anyway, here are my favorites:

  • Incremental project deployment – you can just deploy changed packages to the catalog instead of deploying the entire project.
  • Package parts – you can refactor some control flow tasks in reusable package parts that you can manage separately. This could be very beneficial for SSIS “frameworks” so that you don’t have to update all packages if some changes are introduced later in the development cycle.
  • Cloud tasks and connectors – Lots of attention to moving and transforming data in Azure. For example, there is a task that will allow you to move data to Azure Blog storage in the most efficient way. Continuing this line of thought, the fastest way to move the data to Azure SQL DW would be to use Polybase which supports HDInsight and Azure Blob Storage.
  • MDS Entity Sync – Allows you to reuse entities among models. For example, you can implement a Common model with entities, such as Geography, that you can configure for auto synchronization with other models.
  • 15x performance increase in MDS Excel add-in.

SQL Server Reporting Services (SSRS)

As per the Microsoft’s updated reporting roadmap, SSRS comes out of the closet to fulfill its new role of becoming the on-premises platform for paginated (pixel-perfect), mobile, and Power BI Desktop reports (support for Power BI Desktop files in SSRS will happen after SQL Server 2016). SSRS saw a lot of attention in SQL Server 2016 and brings major new enhancements:

  • Better mobile support – SSRS reports now render in HTML5. Users can use the Power BI native apps for iOS, Android and Windows devices to render both SSRS and Power BI reports. ActiveX print control has been replaced with PDF printing that works on all modern browsers.
  • Facelift – SSRS 2016 brings a new report portal (aka Report Manager). Report Builder has a new look too. Charts and gauges have a new modern look. New chart types (Sunburst and Treemap) have been added. You can now add KPIs directly in the Report Portal.
  • Mobile reports – Thanks to the Datazen acquisitions, you can now have in the box reports that specifically target mobile devices, that have similar features as competing vendors, such as PushBI (now part of Tibco) and RoamBI.
  • Parameter area – You can now control the parameter placement. Personally, I expected also more control over parameters, such as parameter validation, but the alas, the wait is not over.
  • Prioritized native report mode – Microsoft now prioritizes SSRS in native mode which is a great news for customers who previously had to adopt SharePoint Enterprise just for BI. In fact, all the new features are available only in SSRS native mode.

SQL Server Analysis Services (SSAS)

As you know by now, I’m a big fan of classic BI solutions that feature a semantic layer (Multidimensional or Tabular). SSAS gets many new features, including:

  • Tabular many-to-many relationships – You can now implement M2M relationships by setting the relationship cross filtering direction to Both, as you can in Power BI Desktop.
  • Tabular Direct Query enhancements – Microsoft put a lot of effort to lift previous Direct Query limitations in Tabular so that you can build Tabular models on top of fast databases without having to cache the data. Direct Query now have better performance, support for row level security, support for MDX clients such as Excel, support for Oracle, Teradata, and Azure DW.
  • New Tabular scripting language – Tabular models are now described in a new lightweight JSON grammar. This speeds up scheme changes, such as renaming columns. In addition, a new Tabular Object Model (TOM) is introduced to help developers auto-generate Tabular models.
  • DAX new functions – Many new DAX functions (super DAX) were introduced.
  • Multidimensional – support for Power BI and Power BI Desktop. Support for Netezza as a data source. Distinct count ROLAP optimization for DB2, Oracle, and Netezza. Drillthrough is now supported with multi-selection, such as then the user filters on multiple values in Excel.

MS BI Events in Atlanta

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

Teo Lachev
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

Prologika Newsletter Winter 2015

Power BI and You


book1I’m excited to announce the availability of my latest (7th) book – Applied Microsoft Power BI! Currently, this is the only book on Power BI. The book page has more information about the book, including the front matter (with foreword by Jen Underwood), source code, and a sample chapter (Chapter 1 “Introduction to Power BI”). You can order the paper copy on Amazon, and soon on other popular retail channels. I predict that 2016 will be the year of Power BI and I hope that this book will help you to take the most out of it and bring your data to life! And if you’re looking for instructor-led training on Power BI and Microsoft BI, please check our training classes.

 


 

Let’s face it. Without supporting data, you are just another person with an opinion. But data is useless if you can’t derive knowledge from it. And this is where Power BI can help you. While writing the book and helping customers use Power BI, I’m astonished by its breath of features and the development momentum Microsoft has put behind it. The Power BI cloud service gets major features every week, while Power BI Desktop is updated every month! Although this makes it hard for people like me who are writing books, it’s a great value proposition for you.

Not to mention that Power BI has the best business model: most of it it’s free! Power BI Desktop and Power BI Mobile are free. Power BI Service is free and has a Power BI Pro subscription option that you could pay for, following a freemium model. Cost was the biggest hindrance of Power BI, and it’s now been turned around completely. You can’t beat free! In this newsletter, I’ll revisit how Power BI can benefit different users in your organization.

IMG_8221

Power BI for Business Users

To clarify the term, a business user is someone in your organization who is mostly interested in consuming BI artifacts, such as reports and dashboards. Business users can utilize Power BI to connect to popular cloud services, such as Salesforce, Marketo, Google Analytics, Dynamics CRM, and many more. With a few clicks, a business user can use content packs to connect to cloud data and gain insights from predefined reports and dashboards, and create custom reports. Other cloud-hosted providers build profitable businesses around SaaS cloud BI but Power BI does it for free!

With a few clicks, business users can analyze data from files and cubes without having to create data models. And they can also view Power BI dashboards and reports on mobile devices so they are always informed while they are on the go. Again, all of this for free!

Power BI for Data Analysts

A data analyst or BI analyst is a power user who has the skills and desire to create self-service data models. Leveraging the Microsoft’s prior investment in Power Pivot, Power View, and Power Query, Power BI lets business analysts import data form virtually everywhere and create sophisticated self-service models whose features are on a par with professional models and cubes. And now that we have a native support for many-to-many relationships, there shouldn’t be a requirement you can’t meet with Power BI.

As a data analyst you have a choice about your toolset because you can create models in both Excel or in Power BI Desktop. While other vendors charge hefty licensing fees for desktop modeling tools, Power BI Desktop is free and it gets updates every month! Think of Power BI Desktop as the unification of Power Pivot, Power Query, and Power View. Previously available as Excel add-ins, these tools now blend into a single flow. No more guessing which add-in to use and where to find it! Because many data analysts use R to data analysis and statistics, Power BI recently added support for R scripts and visualizing data using the R plotting capabilities.

Power BI for Pros

BI pros and IT pros have much to gain from Power BI. An IT pro can establish a trustworthy environment that promotes sharing of BI artifacts. To do so, IT can set up Power BI workspaces that allow authorized users to see the same BI content. If IT needs to distribute BI artifacts to a wider audience, such as the entire organization, she can create an organizational content pack and publish it to the Power BI Content Gallery. Then her coworkers can search, discover, and use the content pack. And IT can set up an organizational gateway to centralize and grant access to on-premises data.

The scenario that BI pros will probably be most excited about is hybrid BI solutions, where the report definitions (not data) is hosted in Power BI but corporate data remains in relational databases and cubes. This is a scenario that Prologika is planning for a cloud-averse Fortune 10 company in order to empower their users with mobile reports and dashboards. But that’s not all! BI pros can also implement predictive and real-time solutions that integrate with Power BI, and book has the details.

Power BI for Developers

Power BI has plenty to offer to developers as well because it’s built on an open and extensible architecture that embraces popular protocols and standards, such as REST, JSON, and oAuth. For years, Microsoft didn’t have a good solution for embedding interactive reports in custom apps. Power BI enables this scenario by allowing developers to embed dashboard tiles and interactive reports. Soon it will also support custom authentication.

Microsoft has also published the required “custom visuals”  interfaces to allow developers to implement and publish custom visuals using any of the JavaScript-based visualization frameworks, such as D3.js, WebGL, Canvas, or SVG. Do you need visualizations that Power BI doesn’t support to display data more effectively? With some coding wizardry, you can implement your own, such as the Sparkline visual that I published to the Power BI visuals gallery!

In summary, no matter what data visualization or data analytics requirements you have, Power BI should be at the forefront and you ought to evaluate its breath of features. Remember that Power BI is a part of a holistic vision that Microsoft has for delivering cloud and on-premises data analytics. When planning your on-premises BI solutions, consider the Microsoft public reporting roadmap. Keep in mind that you can use both Power BI (cloud-based data analytics) and the SQL Server box product on-premises to implement synergetic solutions that bring your data to life!

As you’d probably agree, the BI landscape is fast-moving and it might be overwhelming. If you need any help with planning and implementing your next-generation BI solution, don’t hesitate to contact me. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects, and rest assured that you’ll get the best service.

Regards,

Teo Lachev

Teo Lachev
President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

EVENTS & RESOURCES

Prologika: “Applied Microsoft Power BI Service” book by Teo Lachev
SQL Saturday BI: “What’s New for BI in SQL Server 2016” presentation by Teo Lachev and “Introduction to R” presentation by Neal Waterstreet on 1/9/2016
Atlanta BI Group: Power BI presentation by Patrick LeBlanc on 1/25/2016

 

Prologika Newsletter Fall 2015

Is ETL (E)ating (T)hou (L)ive?

etlBefore we get to the subject of this newsletter, I’m happy to announce the availability of my latest class – Applied Power BI Service. As you’ve probably heard by now, Power BI Service (or Power BI 2.0) is the Microsoft latest cloud-based analytics service with a simple promise: 5 seconds to sign up, 5 minutes to wow! If you’re ready to disrupt how your organization is analyzing data, please contact me to schedule this class at your convenience and get immediate value.


 

What Not To Do

Back to the newsletter subject, let’s start with a story that was inspired by true events as they say in the movies. Not a long time ago, a man wanted a nice and modern house for his family. He hired a couple of well-known builders but they didn’t deliver what the man really wanted. Then, the man hired a third builder who built him a great house (or a close approximation of the grand vision). Everyone was happy and they lived happily ever after…or at least until the man sold the house to another man.

The second owner had more pressing needs and another vision about the house. Not only the house had to accommodate his family but now the house had to entertain hordes of guests so it had to be expanded. But to cut down cost, the second man decided to maintain the house on his own or outsource whatever he can’t do to a cheap builder. The new owner put hastily new rooms and did other renovations as necessary. Expansion and new construction were his highest priorities and there was never time for proper maintenance or to reinforce the house infrastructure so that it can accommodate the new demands. Needless to say, not much time had passed until the infrastructure gave up. For example, it took days for clogged pipes to drain and guests were not happy. Did I mention the man sold his guests the sun and the moon?

What does this have to do with Extraction, Transformation, and Loading (ETL)? Data is rapidly growing nowadays while ETL processing windows are shrinking. You must do more with less. And, ETL usually becomes a performance bottleneck that stands in the way of your current and future BI initiatives

What To Do

How did the story end? The story didn’t end and it will never end. If you have a house, you can just focus on renovations and additions. You must also maintain it and you must budget for it. One day a member of the man’s family did something out of ordinary and the entire infrastructure collapsed. There wasn’t a way to find out why and the family was scurrying around trying to apply quick fixes. Finally, the second man hired hastily the original builder to assess the situation. Among other things that the builder did to resolve the crisis, he recommended changes and proactive maintenance along the following ten tenets:

  1. Parallelism – The chances are that you have an ETL framework that orchestrates package execution, log errors, etc. And, the chances are that the framework executes packages sequentially. With all the bandwidth modern servers have, there is no excuse if your framework doesn’t support parallel execution. That’s because many ETL tasks, such as ODS loads, loading dimensions and independent fact tables, can benefit greatly from parallel execution. For example, at Prologika we use ETL framework that supports a configurable number of parallelism. Once you configure which packages can run in parallel, the framework distributes the packages across parallel flows.
  2. Incremental extraction – If you have small data volumes, you might get away with fully loading the source data but most systems would require incremental extraction. Again, this is something the ETL framework is best suited to handle.
  3. Volume stats – ETL must log in important data volume metrics, such as number of rows extracted, inserted, updated, and deleted. It should also load how many days were processed since the last incremental extraction and additional context that might be useful for troubleshooting purposes, such as what parameters were passed to stored procedures.
  4. Targeted execution – I recommend you add a target execution duration for each package. Then, ETL will log in the actual duration so that you can detect performance deviations from the norm.
  5. Daily monitoring – I suggest you implement and publish a dashboard, such as using Excel Power Pivot, and monitor this dashboard daily. For example, the dashboard should include a Package Execution KPI that flags excessive executions in red based on the performance metrics you established in step 4.
  6. Regression analysis – Once things “normalize”, create an one-time Extended Events session (assuming SQL Server) to capture the query plans for all significant queries. If during daily monitoring you discover a performance deviation, run the session again focusing on that slow package and compare the query plan with the baseline. Analyze both query plans to find if and why they have changed. To make this easier, when SQL Server 2016 ships, consider upgrading to take advantage of the new Query Store feature.
  7. Cold data archiving – If you lots of source data, e.g. billions of rows, consider archiving historical data that no one cares about, such as by uploading to Azure Table storage.
  8. Project deployment – Consider upgrading to SSIS 2012 or above to benefit from its project deployment so that you can get task-level performance analysis in addition to easier development.
  9. Avoid locking – Use “SET TRANSACTION ISOLATION LEVEL READ UNCOMMITED” at the beginning of your stored procedures of freeform SQL to avoid read locks. I prefer using this statement instead of the NOLOCK hint for its brevity and so that I don’t miss a table.
  10. ELT pattern – I saved the best for last. I’m a big fan of the ELT pattern. I usually try to get out as fast as I can from the SSIS designer. Instead of transformations in the ETL data flow, consider ETL pattern for its performance and maintenance benefits. For more information about the ELT pattern, read my blog “3 Techniques to Save BI Implementation Effort.

As you’d probably agree the BI landscape is fast-moving and it might be overwhelming. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects.

Regards,

Teo Lachev

Teo Lachev
President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

EVENTS & RESOURCES

Prologika: Applied Power BI Service training by Prologika (online or instructor-led):
Atlanta BI Group: Enhancing Data Analysis and Predictive Analytics with NoSQL by Cornell A. Emile on September 28th
Atlanta BI Group: ETL Architecture Reusable Design Patterns and Best Practices by Stephen Davis on October 26th

Prologika Newsletter Summer 2015

What’s New for BI in Office 2016?

office

While you might be still trying to convince management to upgrade to Office 2013, the next version of MicrosoftOffice (Office 2016) is on the horizon and scheduled to be released sometime in the Fall. Meanwhile, you can download and test the Public Preview. Naturally, you might wonder what’s new for BI given that Excel is the Microsoft premium BI desktop tool. In this newsletter, I’ll share my favorite Excel 2016 BI features (most significant listed first).


 

Power View Connectivity to Multidimensional Cubes

I don’t know why it took more than two years for Microsoft to add this feature (initially released for Power View in SharePoint), but you will be finally able to use Power View in Excel 2016 to connect to OLAP cubes, just like we can use Power View in SharePoint as a front end to cubes. What this means to you is that you can preserve your OLAP investment by allowing business users to use Power View to build dashboardsand perform ad-hoc data analytics.

pv

Native Power Query Integration

If you follow the Power Query story, you know that it has an increasingly popular place in the Microsoft BI strategy. In fact, all data roads in Power BI 2.0 go through Power Query because of its integration with Power BI Designer. Besides not having to go through extra steps to download and install Power Query, the native integration allows you to manipulate programmatically Power Query objects using VBA, C#, or PowerShell code and record macros, as explained in more details in this document.

Continuing down the integration path, you can now easily enable all “Power” features from File > Options > Advanced > Turn on data analysis features. In the past you had to enable add-in separately using File- >Options >Add-Ins > Manage “Com Add-ins” > Go.

pq

Power Pivot Improvements

As you probably know, your business users can use Power Pivot to build sophisticated self-service BI models that are on a par with organizational models. Moving forward to Office 2016, the Power Pivot models will support many-to-many relationships. Unfortunately, this feature didn’t make it to the Office 2016 public preview but when released it should work in the same way as in the Power BI Designer. Continuing further down the list, you can now rename tables, columns and calculated fields/measures in the Power Pivot add-in and they will automatically propagate to Excel and existing reports. In another words, metadata changes are not breaking changes to existing reports anymore.

And it gets easier to define relationships. When you are building a Data Model PivotTable working with two or more tables with no relationships defined, you get a notification to run Automatic relationship detection. This will detect and create all relationships between the tables that are used for the Data Model PivotTable so that you don’t have to do any of the work yourself. DAX added many new functions which are documented here.

Usability Improvements

Another long anticipated feature made the cut. The Excel Field List added a Search feature to allow the end user to filter names of tables, columns, and measures. This works across all data structures: Power Pivot, Tabular, and Multidimensional.
Similar to native pivot tables, pivot reports connected to Power Pivot models now support Time grouping. For example, this feature allows the end user to group a report at a day level to any time interval, e.g. year and then quarter.

search

Similar to native pivot tables, pivot reports connected to Power Pivot models now support Time grouping. For example, this feature allows the end user to group a report at a day level to any time interval, e.g. year and then quarter.

grouping

Excel regular slicers now add a multi-select option so that the user doesn’t need to remember to hold down the Ctrl key to select multiple items.

slicer

Forecasting Functions

Recognizing the need for time series forecasting, Excel adds forecasting functions, which are documented here. For example, the Forecast.ETS function predicts a future value based on existing (historical) values by using the AAA version of the Exponential Smoothing (ETS) algorithm. In the example below, Forecast.ETS calculates the forecasted value for the date in cell O16, using the historical values in the range P5:P15 across the dates in the time series from the range O5:O15).

=FORECAST.ETS(O16, P5:P15, O5:O15)
The function also supports arguments also for confidence interval and seasonlity.

As you’d agree, the BI landscape is fast-moving and it might be overwhelming. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects.

Regards,

Teo Lachev

Teo Lachev
President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

EVENTS & RESOURCES

Atlanta BI Group: Microsoft and Revolution Analytics: What’s the Add-Value? by Dr. Mark Tabladillo on June 29th
Atlanta BI Group: Sentiment Analysis with Big Data & Machine Learning by Paco Gonzalez on August 31th

Prologika Newsletter Spring 2015


dw1A while back I met with a client that was considering overhauling their BI. They asked me if the traditional data warehousing still makes sense or should they consider a logical data warehouse, Big Data, or some other “modern variant”. This newsletter discusses where data warehousing is going and explains how different data architectures complement instead of compete with each other.

 


QUO VADIS DATA WAREHOUSE?

The following diagram illustrates the “classic” BI architecture consisting of source data, data staging, data warehouse, data model, and presentation layers (click to enlarge the image).

ar

Yes, not only is the data warehouse not obsolete, but it plays a central role in this architectural stack. Almost always data needs to be imported, cleansed and transformed before it can be analyzed. The ETL effort typically takes 60-80% percent of BI implementation effort. And, currently there isn’t a better way to store the transformed data than to have a central data repository, typically referred to as a “data warehouse”.

That’s because a data warehouse database is specifically designed for analytics.
Ideally, most or even all of dimensional data should be centrally managed by business users (data stewards) outside the data warehouse. However, the unfortunate truth is that not many companies invest in master data management. Instead, they kick the data quality can down the road and pay much higher data quality later but this is a topic for another newsletter. A well-thought architecture should also include a data model (semantic) layer, whose purpose and advantages I’ve outlined in my “Why Semantic Layer?” newsletter.

BIG DATA

Irrespective of the huge vendor propaganda surrounding Big Data, based on my experience most of you still tackle much smaller datasets (usually in the range of millions or billions at worst). This is not Big Data since a single (SMP) database server can accommodate such datasets. This is a good news for you because Big Data typically equates big headaches. For those of you who truly have Big Data, its implementation should complement, instead of replace, your data warehouse. Even though the advancements in the popular Big Data technologies are removing or mitigating some of the Big Data concerns, such as slow queries, these technologies are still “write-once, read many”, meaning that they are not designed for ETL and data changes.

Moreover, a core tenant of data warehousing is providing a user-friendly schema that supports ad-hoc reporting and self-service BI requirements. By contrast, BI Data is typically implemented as a “data lake” where data is simply parked without any transformation. For more information about Big Data and how it can fit into your BI architecture, read my Big Data newsletter.

LOGICAL DATA WAREHOUSE

Recently, logical data warehouses (LDW) have gained some traction and backing from vendors, including Composite (acquired by Cisco), Denado, and others. Logical data warehousing is also known as data federation and data virtualization. The idea is simple – consolidate and share the data in a controlled manner to all users and applications across the enterprise. Data is made available as virtual views on top of existing data sources, with additional features, such as discovery and caching. Again, the goal here is not to replace the traditional data warehouse, but make its data, plus the data from other systems, readily available for self-service BI and/or custom applications.

Logical data warehousing is at a very early stage of adoption. In my opinion, the companies that will benefit most of it are large organizations with many data repositories, where data availability is a major barrier for enabling self-service BI. If you believe that your organization might benefit from a Logical Data Warehouse, you might not need to make a major investment. If your organization has an Office 365 Power BI subscription, your first step could be leveraging the Power Query capabilities for dataset sharing and discovery. This process can work like this:

  1. Designated users promote virtual views in the form of Power Query queries to Office 365.
  2. A data steward verifies and approves these datasets.
  3. Once signed in to Office 365, other users can search, preview these queries, and import the associated data in self-service BI models.

The following snapshot shows how a business user can search and preview a published query that returns Product List data.

cl

One caveat is that these shared queries are currently Power Query-specific and they can only consumed by Microsoft self-service BI tools, which currently include Excel and Power BI Designer. I recommended to Microsoft to expose shared queries as ODATA feeds to enable additional integration scenarios.

SELF-SERVICE BI

Self-service BI allows power users to create their own data models by using specialized tools, such as Power Pivot, Tableau, QlikView, and so on. Despite what some “pure” self-service BI vendors proclaim, self-service BI is not a replacement for data warehousing. In fact, a data warehouse is often a prerequisite and a major enabler for self-service BI. That’s because a data warehouse provides consumers with clean and trusted data that can be further enriched with external data

As you’d agree, the BI landscape is fast-moving and it might be overwhelming. As a Microsoft Gold Partner and premier BI firm, you can trust us to help you plan and implement your data analytics projects.

Regards,

Teo Lachev

Teo Lachev
President and Owner
Prologika, LLC | Making Sense of Data
Microsoft Partner | Gold Data Analytics

EVENTS & RESOURCES

PASS Data Analytics Conference, April 20-22, Santa Clara, California
Atlanta BI Group: Overview of R by Neal Waterstreet on March 30th
Friendlier Data Profiling with the SSIS Data Profiler Task by Julie Smith on April 27th
SQL Saturday Atlanta on May 16th