Shared Datasets
When the Reporting Services team asked my opinion about shared datasets, a new feature in the forthcoming SQL Server 2008 R2 Reporting Services, I was somewhat skeptical. I would have preferred that they focus on features that are more important in my mind, such as the ability to join datasets at the report level. But the more I look at the way shared datasets were implemented, the more real-life scenarios I think may benefit from this enhancement.
Think of a shared dataset as a hybrid between a shared data source and report execution. Similar to a shared data source, a shared dataset is a report dataset that can be managed independently and shared among reports. A shared dataset must use a shared data source. A shared dataset can be parameterized, and reports that use it can pass parameters to it. Similar to report executions, a shared dataset can also be cached and refreshed on a schedule. In my opinion, there are two main scenarios where shared datasets can be useful:
- Easier maintenance – Suppose you have a dataset that you can re-use across reports, such as a dataset that populates the parameter available values. You want all reports to reuse the dataset and pick up the dataset query changes automatically.
- Improved report performance – You may have a query that takes very long to execute. You want to execute the query in an unattended mode, such as outside working hours, perhaps for different sets of parameters, and cache the dataset for a specific duration.
Creating Shared Datasets
Creating a shared dataset is easy. Both BIDS Report Designer and Report Builder 3.0 are capable of creating shared datasets. In Report Designer, you right-click on the new Shared Datasets folder in Solution Explorer, and click Add New Dataset. In Report Builder 3.0, you click the Create Dataset option on the main Create Report or Dataset popup screen.
You can easily convert a report-specific dataset to a shared dataset. Just right-click any dataset in the Report Data window and click Convert to Shared Dataset. As a result, Report Designer will refactor the dataset as a shared dataset and add its definition as an *.rsd file to the Shared Datasets folder in Solution Explorer. In the screenshot below, I have converted the EmployeeSalesDetails dataset in the Employee Sales Summary 2008 report to a shared dataset. Notice that the EmployeeSalesDetails dataset reference inside the report has a special icon.
As I mentioned in a previous post, the project properties dialog has been enhanced to support new deployment settings that are specific to shared datasets. Specifically, the OverwriteDatasets setting specifies whether an existing shared dataset definition will be overwritten on deployment, and the TargetDatasetFolder setting specifies the folder to which the dataset definitions will be deployed.
Once the dataset is configured as shared, you can set up its properties, which include the dataset query, fields, and filters. Then, you can configure the dataset reference inside the report to pass parameters to the shared dataset or to have its own filters. This is similar to how you configure a subreport. What if you want to use row-level security and you need to pass the user identity to the dataset query (User!UserID)? Just add a query parameter to the shared dataset, such as LoginID, and configure the dataset reference inside the report to pass User!UserID as a parameter to the shared dataset.
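To make the row-level security idea more concrete, here is a minimal Python sketch of what the LoginID parameter accomplishes. It is not how Reporting Services is implemented. The Sales table, the query text, and the helper names are all hypothetical, and an in-memory SQLite database stands in for the real data source. The only point is that the query filters on a parameter, and the caller (the report, in the SSRS case) supplies the user identity as that parameter's value.

```python
import sqlite3

# Hypothetical stand-in for the shared dataset query. In SSRS the query
# parameter would be written as @LoginID; SQLite uses :LoginID instead.
SHARED_DATASET_QUERY = """
    SELECT OrderID, Amount
    FROM Sales
    WHERE LoginID = :LoginID
    ORDER BY OrderID
"""


def make_demo_db():
    """Build a small in-memory database (made-up data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Sales (OrderID INTEGER, LoginID TEXT, Amount REAL)")
    conn.executemany(
        "INSERT INTO Sales VALUES (?, ?, ?)",
        [(1, "alice", 100.0), (2, "bob", 250.0), (3, "alice", 50.0)],
    )
    return conn


def run_shared_dataset(conn, user_id):
    """Run the query with the caller's identity bound to the LoginID parameter.

    user_id plays the role of the User!UserID value that the dataset
    reference inside the report passes to the shared dataset.
    """
    return conn.execute(SHARED_DATASET_QUERY, {"LoginID": user_id}).fetchall()
```

Calling run_shared_dataset(make_demo_db(), "alice") returns only the rows that belong to that login, which is the effect you want from passing User!UserID to the shared dataset.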
Managing Shared Datasets
Once the shared dataset is deployed, it can be managed just like any other published item using Report Manager or SharePoint. Common management tasks include setting the data source reference, caching, and security. The most interesting of these is caching. Just like report executions, you can cache a shared dataset. When a shared dataset is configured for caching, the report server will cache a dataset copy for each parameter combination. The same restrictions apply as with caching report executions. Specifically, the data source cannot use the Windows integrated security or Prompt for Credentials authentication options. You can configure cache expiration options. For example, if data latency of 30 minutes is acceptable, you can configure the dataset cache to expire in 30 minutes.
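If it helps to picture the caching behavior, the following Python sketch models a cache that keeps one copy per parameter combination and expires each copy after a fixed interval. This is a toy model under my own assumptions, not the report server's actual implementation. The DatasetCache class, its method names, and the injectable clock are all hypothetical.

```python
import time


class DatasetCache:
    """Toy model: one cached result per parameter combination, with expiration."""

    def __init__(self, run_query, ttl_seconds, clock=time.monotonic):
        self._run_query = run_query   # callable that executes the dataset query
        self._ttl = ttl_seconds       # e.g. 1800 for a 30-minute expiration
        self._clock = clock           # injectable so the expiration can be tested
        self._entries = {}            # key -> (time cached, rows)

    def get(self, **params):
        # The sorted (name, value) pairs identify the parameter combination.
        # Parameter values must be hashable for this to work.
        key = tuple(sorted(params.items()))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]           # cache hit: the query is not executed
        rows = self._run_query(**params)
        self._entries[key] = (now, rows)
        return rows
```

With ttl_seconds=1800, two requests for the same parameter values within 30 minutes execute the query once, while a request with different parameter values gets its own cached copy.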
Let’s say you have a dataset query that takes very long to execute. You can set up a cache refresh plan to warm up the dataset cache on a set schedule. Cache refresh plans are also a new R2 feature. Previously, you would pre-execute reports by creating a subscription using the NULL delivery provider. Moving to R2, you don’t need to use the NULL provider anymore and you can refresh shared datasets independently from reports. In the process of setting up a cache refresh plan, you need to specify an item-specific or shared schedule and default values for each parameter, just like you would do with snapshot caching or subscriptions.
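Conceptually, a cache refresh plan pre-executes the query with the parameter values you specified so that later requests with the same values are served from the cache. The Python sketch below illustrates only that idea. The function names and the plain dictionary used as a cache are hypothetical, and the scheduling part (the item-specific or shared schedule) is left out entirely.

```python
def cache_key(params):
    """Identify a parameter combination by its sorted (name, value) pairs."""
    return tuple(sorted(params.items()))


def run_refresh_plan(cache, run_query, plan_params):
    """Warm the cache: execute the query with the plan's parameter values.

    In a real deployment a schedule would trigger this; here you call it
    directly. Any existing entry for the same parameters is replaced.
    """
    cache[cache_key(plan_params)] = run_query(**plan_params)


def get_rows(cache, run_query, params):
    """Serve a request from the cache, executing the query only on a miss."""
    key = cache_key(params)
    if key not in cache:
        cache[key] = run_query(**params)
    return cache[key]
```

If run_refresh_plan runs outside working hours with, say, Year=2008, a user who later requests Year=2008 through get_rows does not wait for the slow query, while a request with other parameter values still executes it.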
As you can see, shared datasets can help you implement some interesting scenarios when you need to reduce the management effort or improve report performance.