Optimizing Dimension Data Security

Dimension data security is a great feature of Analysis Services and one of the top reasons to strongly consider a semantic layer in your BI implementations, as I mentioned in my "Why Semantic Layer" newsletter. That's because the server automatically secures the cube at connect time, so you don't have to implement your own security framework and secure every report query. If you are new to dimension data security, I covered it in various places, including an article I wrote for SQL Magazine a while back. One thing to be aware of is that the server applies dimension data security when the user sends the first query after the connection is established, even if that query doesn't specifically request the secured dimension, e.g. SELECT NULL ON 0 FROM [CubeName]. Many things happen at this point (see the steps after "Dimension security is applied to all attributes" in Mosha's blog). This initialization sequence is the price you pay to have your entire cube secured. The initialization duration depends on many factors, including the size of the dimension, the size of the allowed set, and the cube script. However, you have some control over the initial performance hit, or you can at least offset it.

Let's consider a real-life scenario. A customer has a medium-size cube (about 50 GB), and dimension data security needs to be applied to an Account dimension whose dimension key attribute has about four million members. Specifically, the allowed set has to be applied on an attribute that has about 300,000 members. In this case, the security policies are stored in a factless fact table (for more information on how dimension data security works with a factless fact table, see the second part of the above article). The initialization time for an allowed set of only two accounts was 9 seconds, which is not so bad. However, if the allowed set returned 20,000 allowed accounts, the user had to wait 30 seconds to connect, which can surely lead to user complaints.
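To give you an idea of the factless fact table pattern, a typical allowed set on the secured attribute might look like the following sketch. The dimension, attribute, and measure names here are hypothetical, not the customer's actual schema:

```mdx
-- Hypothetical allowed set on the [Account].[Level 3] attribute.
-- [Measures].[Security Count] is assumed to come from a factless fact
-- table that associates user logins with their allowed accounts.
NonEmpty(
    [Account].[Level 3].[Level 3].Members,
    (StrToMember("[User].[Login].&[" + UserName() + "]"),
     [Measures].[Security Count])
)
```

The set keeps only the Level 3 members that have a row in the security measure group for the current user's login.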

Reducing the Autoexists Performance Hit

When the server applies dimension data security, it applies it to all attributes within the secured dimension. This behavior is known as Autoexists. In this case, the Account dimension had about 30 attributes. Although dimension security is applied on a single attribute (Level 3), the server automatically secures the rest of the attributes to allow only the members that exist with the allowed members of the secured attribute (if the allowed set returns USA, Autoexists allows only cities within USA). One way to determine how much time the server spends applying Autoexists is to enable and monitor the Query Dimension event in SQL Server Profiler. Another way is to remove all of the cube script (except the first CALCULATE statement) and compare the query execution time before and after.
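As a hypothetical illustration of Autoexists (the cube, dimension, and measure names below are made up), a query that requests a different attribute still comes back secured, even though the query never references the secured attribute:

```mdx
-- Run under a role whose allowed set on [Customer].[Country] returns only USA.
-- Autoexists restricts [City] to cities that exist with USA, even though
-- the query does not reference the [Country] attribute at all.
SELECT
    [Measures].[Sales Amount] ON COLUMNS,
    [Customer].[City].[City].Members ON ROWS
FROM [CubeName];
```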

As I said, the larger the allowed set and the dimension size, the more time the server spends propagating security to the attributes in the secured dimension. To reduce this time, consider introducing a new dimension, e.g. Account Filter, that has as few attributes as possible. Ideally, it should include only the secured attribute, although you might need to include the dimension key attribute so you don't have to make changes to the fact tables in order to join them to the dimension at a higher level. In our case, we tested three scenarios:

Scenario 1: Security applied on the original dimension

An allowed set of two members took 1.7 seconds in Autoexists. An allowed set of 20,000 members took 13 seconds in Autoexists.

Scenario 2: Security applied on a new Account Filter dimension that had three attributes (dimension key, secured attribute, and another attribute required to meet the business requirements for security).

An allowed set of two members took 1.5 seconds in Autoexists. An allowed set of 20,000 members took 3.5 seconds in Autoexists.

Scenario 3: Security applied on a new Account Filter dimension that had two attributes (secured attribute and another attribute required to meet the business requirements for security). This scenario required changes to the fact tables to bring in the secured attribute in order to join them to the Account Filter dimension.

An allowed set of two members took 1.2 seconds in Autoexists. An allowed set 20,000 members took 1.7 seconds in Autoexists.

Optimizing the Cube Script

The cube script had about 3,000 lines, most of them basic calculated members (a best practice is to define calculations in the semantic layer instead of in reports or elsewhere). The items to pay attention to in the script are static sets and default members (default members can be defined in the dimension itself or in the script). A static set is evaluated once per user session. With regard to dimension data security, we've found that the server evaluates the static set in the context of the allowed members. You can enable the Execute MDX Script Current event in the Profiler to see how much time the server spends evaluating each line in the script. In our case, I found that the following static set took some 10 seconds to evaluate with an allowed set of 20,000 members:

CREATE SET [MonthsWithInventory] AS Exists([Date].[Fiscal by Month].[Month].Members, , "Inventory");  

When I converted the set to dynamic (CREATE DYNAMIC SET), the overhead disappeared from the initialization time. However, be aware that during the static set evaluation the server warms the cache with the results of the static set. Switching to a dynamic set saves you the cost of evaluating the Exists query, which in this case was expensive, so the first query executes faster. But it also means that the cache that would have been built up as part of evaluating the set is no longer available for other operations. Therefore, subsequent queries that request data by month might be slower.
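For reference, the conversion simply adds the DYNAMIC keyword to the set definition shown earlier:

```mdx
-- Evaluated per query in the context of the current user's security,
-- instead of once per session during initialization.
CREATE DYNAMIC SET [MonthsWithInventory] AS
    Exists([Date].[Fiscal by Month].[Month].Members, , "Inventory");
```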

I've also found that overriding the default members contributes to the initialization time. If you have the option, remove explicit attribute defaults. Interestingly, trying to conditionally bypass the defaults, such as with the following approach, didn't eliminate the overhead:

ALTER CUBE CURRENTCUBE UPDATE DIMENSION [Account].[Active Flag], Default_Member = iif(IsEmpty(CustomData()), [Account].[Active Flag].&[Active], [Account].[Active Flag].DefaultMember);

 

Special thanks to Akshai Mirchandani on the SSAS team for shedding light in dark places.


Posted Tue, Oct 29 2013 7:52 by tlachev