Tabular M2M Relationships on the Horizon

One of the biggest strengths of Microsoft self-service BI is the ability to create sophisticated data models on a par with organizational BI models built by professionals. This fact is often overlooked when organizations evaluate self-service tools, and the decision is often based on other factors rather than an insightful understanding of each tool's data model capabilities. This is unfortunate because most popular tools on the market don't go much further than supporting a single dataset. By contrast, Power Pivot allows you to easily import multiple datasets from virtually anywhere and join the resulting tables, much as you can in Microsoft Access. This brings tremendous flexibility and analytical power.

Unlike Multidimensional cubes, the Power Pivot and Tabular data models have so far lacked support for declarative many-to-many (M2M) relationships. The workaround has been using a simple DAX formula to resolve the relationship over a bridge table, such as =CALCULATE(SUM(Table[Column]), <BridgeTable>), but this approach might present maintenance issues, as you have to create multiple calculated measures to support different slicing and dicing needs. However, as pointed out in my latest newsletter, the upcoming version of Power BI aims to remove adoption barriers and adds new features. One of these features is bidirectional relationships with declarative support for M2M relationships, which Chris Webb has already written about.

To test the M2M relationship, I attempted to recreate the M2M scenario from my book, which models a joint bank account. The corresponding Power Pivot schema is shown below. The CustomerAccount table is the bridge table that resolves the M2M relationship (a customer might have many accounts, and a bank account might be shared by multiple customers). The Balances table stores the account balances over time, and the Date table lets us analyze these balances by date.

[Figure: Power Pivot schema with the CustomerAccount bridge table]
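For readers who think in SQL, here is a hypothetical relational equivalent of what the bridge table resolves. The column names are my assumption and may differ from the ones in the book:

SELECT c.FullName,
       SUM(b.Balance) AS Balance
FROM Customer c
JOIN CustomerAccount ca ON ca.CustomerKey = c.CustomerKey  -- bridge: one row per customer-account pair
JOIN Balances b ON b.AccountKey = ca.AccountKey
GROUP BY c.FullName;

Note that naively summing the customer-level results would double-count joint accounts in the grand total. Proper M2M semantics count each account balance only once at the total level, which is exactly what the DAX workaround (and now the declarative relationship) takes care of.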

Setting up an M2M relationship in the Power BI Designer is achieved by changing the “Cross filter direction” relationship setting to Both. This setting and bidirectional relationships are described in more detail here.

[Figure: The “Cross filter direction” relationship setting in the Power BI Designer]

Indeed, creating a report that shows balances by customer resolves the M2M relationship and aggregates correctly.

[Figure: Report showing balances by customer]

Unfortunately, attempting to slice the report by Date returns an error in the preview version of the Power BI Designer, so the M2M feature is still a work in progress. Taking this further, a useful addition would be declarative semi-additive aggregation functions that let the user set the aggregation behavior of the Balance measure, such as LastNonEmpty. As in Multidimensional, this would avoid the need for user-defined explicit measures.

Getting ETL Task Duration

Happy New Year!

Does your ETL exceed its processing time window? Optimizing ETL starts with obtaining task-level execution times. If you use the SSIS 2012 project deployment mode, task-level stats are already captured in the SSIS catalog. The following query returns the overall execution status and duration:

SELECT execution_id,
       CASE WHEN [status] = 1 THEN 'created'
            WHEN [status] = 2 THEN 'running'
            WHEN [status] = 3 THEN 'canceled'
            WHEN [status] = 4 THEN 'failed'
            WHEN [status] = 5 THEN 'pending'
            WHEN [status] = 6 THEN 'ended unexpectedly'
            WHEN [status] = 7 THEN 'succeeded'
            WHEN [status] = 8 THEN 'stopping'
            WHEN [status] = 9 THEN 'completed'
       END AS [status_text],
       DATEDIFF(ss, start_time, end_time) DurationInSeconds
FROM catalog.executions e
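The query above reports durations at the execution (package) level. To drill down to individual tasks, you can join the catalog.executables and catalog.executable_statistics views of the SSIS catalog. Here is a minimal sketch; note that execution_duration is reported in milliseconds:

SELECT e.execution_id,
       ex.executable_name,                                -- task (or package) name
       es.execution_path,                                 -- full path of the task within the package
       es.start_time,
       es.end_time,
       es.execution_duration / 1000 AS DurationInSeconds  -- execution_duration is in milliseconds
FROM catalog.executions e
JOIN catalog.executables ex ON ex.execution_id = e.execution_id
JOIN catalog.executable_statistics es ON es.executable_id = ex.executable_id
                                     AND es.execution_id = ex.execution_id
ORDER BY es.execution_duration DESC;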

What if you are not yet on SSIS 2012 or later, or you are not using the project deployment mode or a framework that logs task durations? You can still obtain task durations, but you need to enable SSIS logging for each package you want to monitor, as follows:

  1. Open the package in BIDS/SSDT.
  2. On the SSIS menu, click Logging. Configure logging to use the SSIS Log Provider for SQL Server. The provider saves the statistics in a SQL Server table so you can easily query the results.
  3. On the Details tab, select the OnPreExecute, OnPostExecute, and, most importantly, the OnProgress events so you can get the same level of execution statistics as in the BIDS/SSDT Progress tab.
  4. Once you configure the SSIS Log Provider for SQL Server, it will create a sysssislog table in the database you specified when configuring the provider. When the SQL Server Agent executes your package, you can use a query like the one below to obtain task-level durations:

    SELECT L.executionid Execution,
           PackageName = (SELECT TOP 1 source FROM dbo.sysssislog S WHERE S.executionid = L.executionid AND S.[event] = 'PackageStart'),
           PackageStep = L.source,
           Runtime_min = DATEDIFF(minute, MIN(L.starttime), MAX(L.endtime)),
           Runtime_hr = DATEDIFF(hour, MIN(L.starttime), MAX(L.endtime)),
           StartTime = MIN(L.starttime),
           EndTime = MAX(L.endtime),
           FinishDate = CAST(MAX(L.endtime) AS DATE)
    FROM dbo.sysssislog L
    WHERE L.[event] != 'PackageStart'
    GROUP BY L.executionid, L.source

2015 Annual IT Forecast by TEKsystems

TEKsystems has been a wonderful sponsor of the Atlanta MS BI Group. They’ve recently published an interesting 2015 Annual IT Forecast report. According to the report, Business Intelligence/Big Data will be among the most impactful technologies in 2015. More key findings:

  • Seventy-one percent of IT leaders report confidence in their ability to satisfy business demands in 2015, representing an increase from 66 percent and 54 percent in forecasts for 2014 and 2013, respectively.
  • The top five areas where most IT leaders expect to increase spending in 2015 include security (65 percent), mobility (54 percent), cloud (53 percent), BI/Big Data (49 percent) and storage (46 percent). Twenty-nine percent of IT leaders also expect to increase spending on ERP.
  • Seventy-three percent of IT leaders indicate that operational objectives such as reducing costs, improving efficiency, consolidating, standardizing and streamlining present the biggest organizational challenges.
  • Salary increases are most likely to be average, with 68 percent of IT leaders saying that they expect overall staff salaries to increase by up to 5 percent. Only 8 percent expect increases of 6 percent or more and 21 percent expect salaries to remain the same.
  • Hiring expectations have also slowed. Entering 2014, 47 percent of IT leaders expected an increase in full-time IT staff hiring. Entering 2015, just 40 percent expect an increase, and 50 percent expect it to be the same as 2014.

Atlanta MS BI Group Meeting Tonight

Come and join us tonight for the last 2014 meeting of the Atlanta MS BI Group. In the spirit of the season, I’ll reflect on the state of the Microsoft BI platform and revisit its most important tools and their role in a holistic and modern data analytics environment. Then, for each tool, I’ll discuss its intended use, as well as its pros and cons. We’ll discuss self-service and organizational BI, on-premise and cloud, emerging technologies, and how they complement each other in the context of Microsoft BI. And Mark Tabladillo will give us a cool demo of the Azure Machine Learning Web Service. A $60 Pizza Hut gift card and other cool door prizes from Aspen Brands will be given away. Kudos to our fantastic sponsor TEKsystems for buying us food and drinks!

Embedded Power View and Pivot Reports

I’ve been pestering Microsoft for years to provide an embedded Analysis Services viewer control (similar to the SSRS ReportViewer) that would allow developers to embed interactive reports in custom Windows Forms and web applications. And for years nothing happened, even after Microsoft acquired the Dundas OLAP chart control in 2008. There are some positive signs on that front lately. Microsoft just rolled out the ability to embed Power View and pivot reports in a webpage or blog. I’m sure there are some scenarios that will benefit from this feature, but it is really not what I want because:

  1. It’s just a URL-based mechanism targeting deployed reports, and its customization options are limited to layout adjustments.
  2. It’s not a control that developers can customize, such as to change the connection string in order to pass custom user credentials, replace parameters, etc.
  3. It requires the reports to be hosted in Office 365. Hence, at least for now, this feature can’t be used with on-prem data.

SQL PASS Summit 2014 Links

Don’t miss the gist of the SQL PASS Summit 2014.

Keynote Day One: http://www.sqlpass.org/summit/2014/PASStv.aspx?watch=7Pum0vfYtSk

Keynote Day Two: http://www.sqlpass.org/summit/2014/PASStv.aspx?watch=g8DSwPjmLv4

All PASStv sessions can be found here: http://www.sqlpass.org/summit/2014/PASStv.aspx

Presenting at DAMA Georgia and Atlanta BI Group

I’ll present at DAMA Georgia Chapter on November 12. The topic will be “Best Practices for Establishing a Solid BI Foundation”. For more details, please visit the event page.

Don’t know where to start with BI or if you’re on the right track? Just like everything else, a successful BI rollout is based on a solid foundation. Targeting BI managers, technology officers, and architects, this advisory and technical session presents proven best practices for implementing BI in mid-size and large organizations. I’ll present approaches and recommendations for the main layers of the BI architectural stack, ranging from staging databases, through data marts and warehouses and semantic layers, to reporting tools. We’ll discuss self-service and organizational BI, Big Data, and emerging technologies, and how they complement each other. Some of the concepts will be accompanied by demos using the Microsoft BI stack.

Then, on December 15th, I’ll present “Microsoft BI 2014 Review” at the Atlanta BI Group.

In the spirit of the season, join us to reflect on the state of the Microsoft BI platform at the end of 2014. I’ll revisit its most important tools and their role in a holistic and modern data analytics environment. Then, for each tool, I’ll discuss its intended use, as well as pros and cons. We’ll discuss self-service and organizational BI, on-premise and cloud, emerging technologies, and how they complement each other in the context of Microsoft BI.

If you are in Atlanta, I hope you can join me to talk data analytics.

Operational BI with Azure Stream Analytics

There is a lot of talk nowadays about the Internet of Things (IoT). According to Gartner, there will be nearly 26 billion IoT devices by 2020. Naturally, the data generated by these devices needs to be processed and analyzed, very often in real time. Indeed, an increasing number of customers need real-time (operational) analytics performed over a stream of events, such as data coming from sensors, barcode readers, social streams, and all sorts of other devices. Currently, .NET developers can use SQL Server StreamInsight to implement custom on-premise CEP (complex event processing) solutions. However, implementing StreamInsight-based applications is not easy, as it requires solid .NET and LINQ skills.

Today, Microsoft announced the public preview of the Azure Stream Analytics service, which allows organizations to perform stream analytics in the cloud. What’s interesting is that Microsoft made a significant effort to simplify CEP, with the promise that “you can be up and running in minutes”. To that end, another cloud service, Azure Event Hubs, simplifies the process of ingesting events. And, instead of using .NET and LINQ, developers can use the Stream Analytics Query Language, which has a SQL-like syntax for coding standing queries over event streams, such as:

SELECT DateAdd(second, -5, System.TimeStamp) AS WinStartTime,
       System.TimeStamp AS WinEndTime,
       DeviceId,
       Avg(Temperature) AS AvgTemperature,
       Count(*) AS EventCount
FROM input
GROUP BY TumblingWindow(second, 5), DeviceId

At the same time, Azure Stream Analytics preserves the advanced features of StreamInsight, such as windowing. The results of the standing queries can be saved to Azure SQL Database, Azure Blob storage, or Azure Event Hubs for further analysis, such as by using Excel. For more information about Azure Stream Analytics and to subscribe to the public preview, visit the service home page.


I’m excited and expect to see a lot of interest around the Azure Stream Analytics service. If this sounds interesting and you need help, as a Microsoft Gold Partner and premier BI firm, Prologika can help you get started in a cost-effective way, for example by using your Software Assurance vouchers for consulting services around data analytics, such as implementing a POC.

Atlanta MS BI Group Meeting on Oct 27th

Join us on Monday, October 27th for our next meeting of Atlanta MS BI Group to learn about predictive analytics and how to actually do it.

Presentation: Mine Craft
Level: Intermediate
Date: Monday, October 27th, 2014
Time: 6:30 – 8:30 PM ET
Place: South Terraces Building (Auditorium Room)

115 Perimeter Center Place

Atlanta, GA 30346

Overview: Why you should be mining your data and how to actually do it. Every company needs a rock star. We want it to be you. This session will give real-world examples of data mining successes, as well as walk you through how to get started down the path of data enlightenment, so that you too can say “I Am A Data Miner℠”.
Speaker: Mark Tabladillo provides enterprise data science analytics advice and solutions. He uses Microsoft Azure Machine Learning, Microsoft SQL Server Data Mining, SAS, SPSS, R, and Hadoop (among other tools). He works with Microsoft Business Intelligence (SSAS, SSIS, SSRS, SharePoint, Power BI, .NET). Mark has been a national leader in analytics and data science (data mining and machine learning) through conference speaking and instructional leadership since 1998. He connects with people on LinkedIn and Twitter @marktabnet.

David McFarland is a Senior Manager Business Intelligence with RentPath, Inc. David spends the vast majority of his day trying to get out of useless meetings. He has no certifications whatsoever and is pretty sure Microsoft has no idea who he is, except when it’s time to renew enterprise software agreements.

Sponsor: Tegile
With demand growing for bigger data and faster service, your data storage choices can make or break your business. Accelerate your business with Tegile’s all-flash and hybrid storage solutions.

Sybase Integration

A Major League Baseball team engaged us to implement the foundation of their data analytics platform. They partner with Ticketmaster for ticketing and sales. Interestingly, besides the Ticketmaster cloud hosting that everyone is familiar with, Ticketmaster also offers a client application called Archtics that allows customers to sell tickets on premise. The client application uses a Sybase database (as you probably know, Sybase was acquired by SAP) that syncs with the host. Fortunately, Sybase and SQL Server have a lot in common, but in the process we had to figure out a way to pull data from the Sybase database. To do so, you need to follow these steps:

  1. Install the SQL Anywhere database client. During development, you will need the 32-bit driver because BIDS/SSDT is a 32-bit app. When running via SQL Server Agent, you’ll need the 64-bit driver.
  2. Create an ODBC data source. Again, you need to create two ODBC data sources: one using the 64-bit ODBC Administrator and one using the 32-bit ODBC Administrator.
  3. Use the ODBC Source in SSIS to extract data from Sybase. Unfortunately, the ODBC Source doesn’t support parameterized statements, such as to extract data incrementally. As a workaround, you can use an expression-based SQL command text. You can do this by clicking the Data Flow task in the package control flow and setting up an expression for the SQLCommand property, as shown in the sketch after the screenshot below.

    [Figure: Setting an expression for the SQLCommand property of the Data Flow task]
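For illustration, here is a minimal sketch of what such a property expression might look like, assuming a hypothetical Orders table and a User::LastExtractDate package variable that stores the timestamp of the last successful extraction (the SSIS expression language concatenates strings with + and converts the variable with a DT_WSTR cast):

    "SELECT * FROM Orders WHERE ModifiedDate > '" + (DT_WSTR, 30) @[User::LastExtractDate] + "'"

At run time, SSIS evaluates the expression and passes the resulting statement to the ODBC Source, so each execution extracts only the rows changed since the previous run.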