Posts

Your Data Warehouse in the Cloud?

I spoke with a customer today who has implemented Salesforce.com. For those who are not familiar with Salesforce, it’s a popular cloud-based customer relationship management (CRM) product. As often happens, just when you’re done with the Salesforce implementation, you’re immediately faced with the challenge of consolidated reporting. It won’t be long before the Vice President of Sales asks you to integrate sales data residing in Salesforce.com with your on-premises data sources. In this case, the customer went to Dreamforce (Salesforce’s premier conference) in search of integration options and was advised to solve the report consolidation issue by … migrating their multi-terabyte data warehouse to Salesforce.com!

I’m sure that this approach makes perfect sense to Salesforce.com, but it’s hardly in the customer’s best interest. First, although Salesforce is extensible and you can add custom objects (tables), Salesforce.com is not designed to host relational databases. As far as I know, it doesn’t have ETL tools, an analytical layer, or comprehensive reporting capabilities. Second, even with the enormous recent strides in cloud computing and ever-decreasing storage prices, it’s hard to imagine anyone moving a data warehouse to the cloud. It’s just cost-prohibitive to do so. Third, there are data logistics challenges in populating a cloud-based data warehouse, such as uploading gigabytes of data from on-premises databases to the cloud over the Internet.

At Prologika, we advise our customers to keep data where it belongs: operational data in the on-premises data warehouse and sales data in the Salesforce.com cloud. By design, the data warehouse is an enterprise repository for storing and consolidating data from operational data sources. We design Integration Services packages that retrieve data by integrating with the Salesforce.com web service and import it into the data warehouse. This opens all kinds of interesting analytical possibilities, such as implementing forecasting reports that combine actual and opportunity revenue.
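To make the forecasting idea concrete, here is a minimal sketch of how actual revenue from the warehouse could be combined with open pipeline revenue pulled from Salesforce. This is illustrative only, not the actual Integration Services implementation; the warehouse rows are invented, and the opportunity fields (Amount, Probability, StageName) follow Salesforce’s standard Opportunity object.

```python
# Illustrative only: blend warehouse actuals with Salesforce pipeline data.
# The data below is made up; field names Amount/Probability/StageName mirror
# Salesforce's standard Opportunity object.

def forecast_revenue(actuals, opportunities):
    """Actual revenue plus open pipeline weighted by win probability."""
    actual_total = sum(row["amount"] for row in actuals)
    weighted_pipeline = sum(
        opp["Amount"] * opp["Probability"] / 100.0
        for opp in opportunities
        if opp["StageName"] not in ("Closed Won", "Closed Lost")
    )
    return actual_total, weighted_pipeline, actual_total + weighted_pipeline

actuals = [{"amount": 120000.0}, {"amount": 80000.0}]  # from the warehouse
opportunities = [                                      # from Salesforce
    {"Amount": 50000.0, "Probability": 60, "StageName": "Negotiation"},
    {"Amount": 30000.0, "Probability": 90, "StageName": "Closed Won"},
]

actual, pipeline, total = forecast_revenue(actuals, opportunities)
# actual revenue 200,000; weighted pipeline 30,000; forecast 230,000
```

In a real package, the closed-won rows would already be in the warehouse fact table, which is why the sketch excludes them from the pipeline side.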

When the customer requires reports that source data from the data warehouse, we implement a web service endpoint residing on the customer’s premises that allows Salesforce.com to pull data from the data warehouse and cubes. Or, if it’s acceptable for the salespeople to be logged in to the customer’s network, we extend Salesforce to pass parameters to operational reports hosted on premises, such as in SharePoint. This bi-directional integration lets our customers keep data where it belongs while allowing each system to obtain data from the other.

Sometimes, it’s better to keep your head and data out of the cloud no matter how modern and exciting this might sound.


Microsoft Case Study for Recall and Prologika

Microsoft published a case study, “Records Management Firm Saves $1 Million, Gains Faster Data Access with Microsoft BI”. Prologika architected the data warehouse, OLAP cubes, and presentation layer consisting of operational reports, a SharePoint management dashboard, and Power View reports.

Recall, a records-management firm, needed faster access to key performance indicators and more intuitive business intelligence (BI) tools. The company consolidated four data centers into a Microsoft SQL Server 2012 data warehouse. The solution’s performance enhancements speed employee access to more detailed data. By consolidating into a data warehouse, the company saved U.S. $1 million in hardware and licensing costs…With help from Microsoft partners Prologika and Berg Information Technology, Recall started deployment in August 2011 and went into production in February 2012.

BIDS Helper 1.6 Beta Released

Fellow MVPs have just released the latest public beta build of BIDS Helper, which should be one of the first utilities you install after you install SQL Server on your machine. Besides fixes and updates, this release adds support for SQL Server 2012 and new features specific to Analysis Services Tabular.

This beta release is the first to support SQL Server 2012 (in addition to SQL Server 2005, 2008, and 2008 R2). Since it is marked as a beta release, we are looking for bug reports in the next few months as you use BIDS Helper on real projects. In addition to getting all existing BIDS Helper functionality working appropriately in SQL Server 2012 (SSDT), the following features are new.

  • Analysis Services Tabular
  • Smart Diff
  • Tabular Actions Editor
  • Tabular HideMemberIf
  • Tabular Pre-Build

Happy New Year 2012!

As 2011 is winding down, it’s time to reflect on the past and plan for the future. 2011 has been a very exciting year for Microsoft BI and me.

  1. Gartner positioned Microsoft as a leader in the 2011 Magic Quadrant for Business Intelligence Platforms.
  2. Although SQL Server 2012 will technically ship early next year, we can say it’s a done deal as it’s currently in the release candidate phase. The most important news from a BI perspective is the evolution of the Business Intelligence Semantic Model (BISM), which is an umbrella name for both Multidimensional and Tabular models.
  3. The Tabular model provides us with a nice personal (PowerPivot for Excel)-team (PowerPivot for SharePoint)-organizational (Analysis Services Tabular) continuum on a single platform.
  4. Power View extends the BI reporting toolset with a sleek web-based reporting tool for authoring highly interactive and presentation-ready reports.
  5. In its second release, Master Data Services (MDS) comes of age and now allows end users to use Excel to manage master data. The newcomer, Data Quality Services (DQS), complements MDS nicely in the never-ending pursuit of clean and trusted data. Integration Services also has some nice enhancements. Finally, columnstore indexes will help to aggregate large datasets, such as the scenario I mentioned in this blog.

Looking forward to 2012 and beyond, here is my top 5 BI wish list:

  1. Extending the Tabular capabilities with more professional features, such as scope assignments, role-playing dimensions, MDX query support, and so on, to enhance its reach further into the corporate space. Ideally, I expect at some point in the future a unification of Multidimensional and Tabular so BI pros don’t have to choose a model.
  2. Extending Power View to support multidimensional cubes. Further, in the reporting area, I expect an embeddable web-based OLAP browser (it’s time for Dundas Chart to come back to life) and an improved MDX query designer (no, I haven’t lost hope for this one).
  3. Enhanced Excel BI capabilities so Excel becomes the BI tool of choice. This includes supporting PowerPivot natively and overhauling the reporting capabilities beyond the venerable PivotTable and PivotChart. Ideally, what I am hoping for is decoupling Power View from SharePoint and integrating it with Excel and custom web applications. Power View is too cool to be confined only to SharePoint.
  4. Extending Microsoft Azure with BI capabilities to let solution providers host BI models in the cloud.
  5. Bringing BI to mobile devices.

On the personal side of things, I’ve been fortunate to stay healthy and busy (very busy). The Atlanta BI group, which I am leading, has grown in size and we now enjoy having 40-50 people attending our monthly meetings. For the past few months, I’ve been working on my next book, Applied Microsoft SQL Server 2012 Analysis Services (Tabular Modeling), which I expect to get published in March. And, my consulting business has been great!

I wish you a healthy and prosperous year! I hope to meet many of you in 2012. Meanwhile, you can find me online at the usual places: www.prologika.com | blog | linkedin | twitter.

Happy New Year!

MVP For Another Year!

Just got the news that my MVP status got extended for another year! This makes six consecutive years as an MVP and a member of an elite group of professionals that I am proud to belong to.

Most Requested Features

You can use the Microsoft Connect website to find the most requested features. Unfortunately, the search doesn’t let you specify a product, so the results may include other products. For example, searching on reporting services may bring in results for Analysis Services as well. Nevertheless, it was quite interesting to find the top-voted suggestions. For example, the following query shows the top suggestions for Reporting Services (flip to the Suggestions tab):

https://connect.microsoft.com/SQLServer/SearchResults.aspx?KeywordSearchIn=2&SearchQuery=%22reporting%22+AND+%22services%22&FeedbackType=1&Scope=0&SortOrder=10&TabView=0&wa=wsignin1.0

  1. Reporting Services-Recognize multiple result sets returned from a stored procedure – (50 votes)
  2. Merging / Linking datasets on report level (50 votes) – No 6 on my SSRS Top 10 wish list.
  3. SQL Reports should support stylesheets (43 votes) – No 9 on my list.
  4. Support for XML Paper Specification (XPS) Output Format (29 votes) – I am personally surprised about this one.
  5. Reporting Services Security Using Membership and Roles (29 votes), and so on

Microsoft Live Labs Pivot

In case you’ve missed this, the Pivot era has begun. After Excel PivotTable and PivotChart, we’ll have PowerPivot in SQL Server 2008 R2. But Pivot evolves… A co-worker showed me today a glimpse of the Pivot future, which I guess is Microsoft Live Labs Pivot. Since grids and charts are not cool anymore, we now have pictures and animation. It’s hard for me to understand at this point how this would apply to Business Intelligence, but the Silverlight app with all these pictures sure looks catchy.

Long live Pivot!

Cumulative Update 3 for SQL Server 2008 Service Pack 1

Microsoft released today a Cumulative Update 3 for SQL Server 2008 Service Pack 1. Among other things it fixes the Report Builder 2.0 ClickOnce deployment issue on x64 which I reported here.

Google – The Best Thing that Ever Happened to Microsoft

On a different subject, you’ve probably heard the news: Google will release an operating system called Google Chrome OS which will challenge Windows and instill fear into the Microsoft camp. To the contrary, I think Google is the best thing that ever happened to Microsoft. How come?

I remember reading somewhere that after perestroika, Mikhail Gorbachev supposedly told Ronald Reagan, “We’ll now do the worst thing to you (the USA); we’ll leave you without an enemy.” Perhaps too extreme, but paraphrased to business, competition is a good thing. When challenged, good companies become better, products improve, and consumers benefit. So, I hope Google will continue expanding its ambitions. Similarly, I hope Microsoft gives Google a run for its money by competing relentlessly to increase its share of the search market. I’d personally love to see business intelligence features added to the search results. Why not? The first two letters of Bing are BI, right? How come I can’t even sort the search results by date to see the most recent matches on top? A chart showing the popularity of the search term over time? People who searched for X also searched for Y?

It will be interesting to see how this mega competition evolves over time. As the Chinese saying goes, “may you live in interesting times”.

Transmissions from TechEd USA 2009 (Day 3)

Today was a long day. I started by attending Richard Tkachuk’s A First Look at Large-Scale Data Warehousing in Microsoft SQL Server Code Name “Madison” session. Those of you familiar with Analysis Services will probably recognize the presenter’s name, since Richard came from the Analysis Services team and maintains the www.sqlserveranalysisservices.com website. He recently moved to the Madison team. Madison is a new product based on technology from DATAllegro, which Microsoft acquired some time ago. As the session name suggests, it’s all about large-scale databases, such as those exceeding 1 terabyte of data. This is an enormous amount of data that not many companies will ever amass. I’ve been fortunate (or unfortunate) never to have had to deal with such data volumes. If you do, you may want to take a look at Madison. It’s designed to maximize sequential querying of data by employing a shared-nothing architecture where each processor core is given dedicated resources, such as a table partition. A controller node orchestrates the query execution. For example, if a query spans several tables, the controller node parses the query to understand where the data is located. Then, it forwards the query to each compute node that handles the required resources. The compute nodes are clustered in a SQL Server 2008 failover cluster running on Windows Server 2008. The tool provides a management dashboard where the administrator can see the utilization of each compute node.
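The controller/compute-node pattern described above can be sketched in a few lines. This is a toy model of the shared-nothing idea, not Madison itself: each “compute node” is just a list standing in for a local table partition, and the “controller” fans a query out and combines the partial results. All names and data are invented for illustration.

```python
# Toy model of a shared-nothing MPP query: each partition plays the role of
# a compute node's local data; the controller fans out and combines.

partitions = [
    [{"region": "East", "sales": 100}, {"region": "West", "sales": 250}],
    [{"region": "East", "sales": 300}],
    [{"region": "West", "sales": 50}, {"region": "East", "sales": 25}],
]

def node_partial_sum(partition, region):
    """What a single compute node does against its local partition."""
    return sum(row["sales"] for row in partition if row["region"] == region)

def controller_query(region):
    """The controller forwards the query to every node and sums the partials."""
    return sum(node_partial_sum(p, region) for p in partitions)

east_total = controller_query("East")  # 100 + 300 + 25 = 425
```

The point is that each node scans only its own slice of the data in parallel, which is why the architecture scales to very large sequential scans.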

Next, I attended the Fifth Annual Power Hour session. As its name suggests, TechEd has been carrying out this session for the past five years. The session format was originally introduced by Bill Baker, who’s no longer with Microsoft. If you ever attended one of these sessions, you know the format. Product managers from all BI teams (Office, SharePoint, PerformancePoint, and SQL Server) show bizarre demos and throw t-shirts and toys to everything that moves (OK, sits). The Office team showed an Excel Services demo where an Excel spreadsheet ranked popular comics characters. Not to be outdone, the PerformancePoint team showed a pixel-based image of Mona Lisa. I’m not sure what PerformancePoint capabilities this demonstrated, since I don’t know PerformancePoint that well, but it looked great.

The Reporting Services team showed a cool demo where the WinForms ReportViewer control rendered a clickable map (the map control will debut in SQL Server 2008 R2) that re-assigns the number of Microsoft sales employees across the US states. For me, the coolest part of this demo was that there was no visible refresh when the map image was clicked, although there was probably round-tripping between the control and the server. Thierry D’Hers later clued me in that there is some kind of buffering going on, which I have to learn more about. This map control looks cool! Once I get my hands on it, with some tweaking maybe I’ll be able to configure it as a heat map that is not geospatial.

Finally, Donald Farmer showed another Gemini demo, which helped me learn more about Gemini. I realized that 20 million+ rows were compressed to a 200 MB Excel file. However, the level of compression really depends on the data loaded in Excel. Specifically, it depends on the redundant values in each column. I learned that the in-memory model constructed in Excel is implemented as an in-process DLL whose code was derived from the Analysis Services code base. The speed of the in-memory model is phenomenal! 20 million rows sorted within a second on Donald’s notebook (not even a laptop, mind you). At this point Microsoft hasn’t decided how Gemini will be licensed and priced.

As usual, after lunch I decided to hang around in the BI learning center and help with questions. Then, it was show time for my presentation! I don’t know why, but every TechEd I get one of those rooms that I feel intimidated just looking at. How come Microsoft presenters who demo cooler stuff than mine, such as features in the next version, get smaller rooms while I get those monstrous rooms? It must be intentional; I have to ask the TechEd organizers. The room I got was next to the keynote hall and could easily accommodate 500-600 people, if not more. Two years ago, I actually had a record of 500+ people attending my session, which was scheduled right after the keynote.

This year, the attendance was more modest. I don’t have the final count yet, but I think 150+ folks attended my session, so there was plenty of room to scale up. I think the presentation went very well. The preliminary evaluation reports confirm this. I demoed report authoring, management, and delivery tips sprinkled with real-life examples. We had a good time and I think everyone enjoyed the show.

It’s always good to know that your work is done. I look forward to enjoying the rest of TechEd and LA.