Performance tuning – my favorite! This blog originated from a quest to reduce the processing time of an SSAS cube which loads some 2.5 billion rows and includes DISCINTCT COUNT measure groups. The initial time to fully process the cube was about 50 minutes on a dedicated DELL PowerEdge R810 server, with 256 GB RAM and two physical processors (32 cores total). Both the SSAS and database servers were underutilizing the CPU resources with SSAS about 60-70 utilizations and the database server about 20-30 CPU utilization. What was the bottleneck?
By using the sys.dm_os_waiting_tasks DMV like the statement below (you can use also the SQL Server Activity Monitor), we saw a high number of CXPACKET wait types.
-- Optional columns
dm_ws.session_id = dm_r.session_id
dm_es.session_id = dm_r.session_id
dm_es.is_user_process = 1
The typical advice given to address CXPACKET waits is to decrease the SQL parallelism by using the MAXDOP setting. This might help in some isolated scenarios, such as UPDATE or DELETE queries. However, the SQL Sentry Plan Explorer showed that each processing query is highly parallelized to utilize all cores. Notice in the screenshot below, that thread 16 fetches only 14,803 rows.
Therefore, the CXPACKET waits were simply caused by faster threads waiting for other threads to finish. In other words, CXPACKET wait is just a coordination mechanism between the threads being parallelized. To confirm this, we set the SQL Server MAXDOP setting to 1. Surely, the CXPACKET waits disappeared but the overall cube processing time went up as well. In our case, the biggest benefit was realized not by decreasing the SQL Server parallelism but by increasing it, by increasing the maximum number of database connections. This resulted in decreasing the overall processing time some 20%.
You need to be careful here though. While increasing the connections to max out the CPU on the SSAS server will yield the biggest gain, it might also slow down other processing, such as reports that query the cube while the database is being processed. So, as a rule of thumb, target no more than 80% CPU utilization to leave room for other tasks.
Mon, May 14 2012 10:12 PM