Posts

Operational BI with Azure Stream Analytics

There is a lot of talk nowadays about Internet of Things (IoT). According to Gartner, there will be nearly 26 billion IoT devices by 2020. Naturally, the data generated by these devices needs to processed and analyzed, very often in real time. Indeed, an increasing number of customers need real-time (operational) analytics performed over a stream of events, such as data coming from sensors, barcode readers, social streams, and all sorts of other devices. Currently, .NET developers could use SQL Server StreamInsight to implement on-premise custom CEP (complex event processing) solutions. However, implementing StreamInsight-based applications is not easy as it requires solid .NET and LINQ skills.

Today, Microsoft announced the public preview of the Azure Stream Analytics service that allows organizations to perform stream analytics in the cloud. What’s interesting is that Microsoft made a significant effort to simplify CEP with the promise that “you can be up and running in minutes”. To that end another cloud service, Azure Event Hubs, simplifies the process of intercepting (sinking) events. And, instead of using .NET LINQ, developers can use Stream Analytics Query Language which has a SQL-like syntax for coding standing queries over the event streams, such as:

SELECT DateAdd(second,-5,System.TimeStamp) as WinStartTime, system.TimeStamp as WinEndTime, DeviceId, Avg(Temperature) as AvgTemperature, Count(*) as EventCount FROM input GROUP BY TumblingWindow(second, 5), DeviceId

At the same time, Azure Stream Analytics preserves the advanced features of StreamInsight, such as windowing. The results of the standing queries can be saved to Azure SQL Database, Azure Blob storage, and Azure Event Hub, for further analysis, such as by using Excel. For more information about Azure Stream Insight and to subscribe for the public preview, visit the service home page.

103014_0141_Operational1

I’m excited and expect to see a lot of interest around the Azure Stream Analytics service. If this sounds interesting and you need help, as a Microsoft Gold Partner and premier BI firm, Prologika can help you get started in a cost-effective way, such as by using your Software Assurance vouchers to deliver consulting services around data analytics, such as to implement a POC.

Big Data Twitter Demo

So, you have (or heard of) classic BI, self-service BI, Big Data BI, descriptive BI, predictive BI or even prescriptive BI, but do you have real-time BI? I’ve been doing quite bit of research and work in that area lately. As you could imagine, real-time BI requires a different architecture that is capable of processing streams of data (sometimes thousands of events) in real time. The Microsoft premium technology for Complex Event Processing (CEP) is StreamInsight (requires a SQL Server license). Microsoft has also a lightweight, open-source .NET library called RX which does event streaming but it doesn’t have many of the StreamInsight features, such as windowing. To demonstrate how classic BI, Big Data, and real-time BI can play together, Microsoft put together a great sample – Big Data Twitter Demo.

The demo allows you to subscribe to one or more Twitter topics of interest. It uses StreamInsight to listen to the Twitter activity and extract tweets that match the topics you “subscribe” to. In the screenshot below, I’m intercepting tweets about Microsoft and SQL Server. Then the demo saves the results to a SQL server table for offline analysis with PowerPivot and Power View (a sample Excel workbook with reports is included). In addition, the demo stores the results in a SQL Azure Hadoop cluster (HDInsight). I guess the idea is to truncate the operational SQL Server store on a regular basis while archiving all data on Hadoop for future analysis. The demo also includes a dashboard that displays the matching tweets and hit rate in real time. Behind the scenes, the application uses Web Sockets (IMO, SignalR would have been a better choice here since Web Sockets have limited browser and platform support) to communicate with the JQuery code on the client which updates the dashboard content. For more information about how all of this work, Mike Wilmot covers the demo in more details.

062013_2135_BigDataTwit1

This is a very impressive demo and I can imagine how much effort went into building it. I personally believe that we’ll see more demand for real-time applications, especially coupled with predictive analytics, such as detecting outliers or forecasting volumes.On the downside, a few days after Microsoft released the demo, Twitter discontinued Basic Authentication, which the demo uses to authenticate with Twitter (you need a Twitter account to run it). Twitter now uses OAuth so I had to tweak the code. Specifically, I added the OAuthTokens.cs and WebRequestBuilder.cs from the Patrick Smith’s Twitterizer library to the StreamInsight.Demos.Twitter.Common class library in the demo. In the same library, I changed the Read method in the TwitterStreaming class as follows:

public TextReader Read() {

var url = GetURL();

// Basic Authencation – Obsolete

//var request = HttpWebRequest.Create(url);

//request.Timeout = _config.Timeout; 

//request.Credentials = new NetworkCredential(_config.Username, _config.Password);

//var response = request.GetResponse();

// Twitter uses OAuth now which is much more complex to implement so you need wrapper classes, such as Twitterizer

OAuthTokens tokens = new OAuthTokens();

tokens.ConsumerKey = “<your Twitter consumer key>”;

tokens.ConsumerSecret = “<your Twitter consumer secret>”;

tokens.AccessToken = “<your Twitter access token>”;

tokens.AccessTokenSecret = “<your Twitter access token secret>”

WebRequestBuilder requestBuilder = new WebRequestBuilder(new Uri(url), HTTPVerb.GET, tokens);

var response = requestBuilder.ExecuteRequest();

return new StreamReader(response.GetResponseStream());

}

And, to get the dashboard to work, I had to use Safari since for some obscure reason Web Sockets won’t work for me with IE 10 on Windows 8. If I have time, I plan to cover StreamInsight and real-time BI in more details in future posts.