
Smart Date Keys Got Smarter

Using “smart” integer date surrogate keys in the format YYYYMMDD is a dimensional modeling best practice and a venerable design technique. To make them smarter, though, consider the Date data type introduced in SQL Server 2008. The Date data type stores the date without the time component, which is exactly what you need anyway. Using the Date data type instead of integers has the following benefits:

  • The Date data type makes date arithmetic easier because the date semantics are preserved, so you can use date functions, such as YEAR and MONTH, when querying the table or partitioning a cube.
  • Self-service BI users can conveniently filter dates on import. For example, PowerPivot enables relative date filtering, such as Last Month, when the user filters a date column.
  • The Date data type requires three bytes of storage as opposed to four bytes for integers.

The only small downside of using the Date data type is that you will end up with funny-looking unique names for the Date members in your Date dimension in an SSAS cube. The unique name will include the time portion, such as [Date].[Date].&[2013-07-01T00:00:00]. If you construct date members dynamically, e.g. to default to the current date, you have to append the time portion or use the Format function, such as Format(Now(), "yyyy-MM-ddT00:00:00").
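To illustrate, here is a minimal sketch of a date dimension keyed on the Date data type; the table and column names are hypothetical:

-- Hypothetical date dimension: the surrogate key is a date (3 bytes) instead of an integer YYYYMMDD key (4 bytes)
CREATE TABLE dbo.DIM_DATE
(
    Date_Id        date NOT NULL PRIMARY KEY,
    Calendar_Year  AS YEAR(Date_Id),   -- date semantics preserved: no conversion from an integer key needed
    Calendar_Month AS MONTH(Date_Id)
);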

Optimizing Massive SQL Joins

Scenario: Run ETL to perform a full data warehouse load. One of the steps requires joining four fairly big tables in a staging database with 1:M logical relationships. The tables have the following row counts:

  • VOUCHER: 1,802,743
  • VOUCHER_LINE: 2,183,469
  • DISTRIB_LINE: 2,658,726
  • VCHR_ACCTG_LINE: 10,242,414

Observations: On the development server, the SELECT query runs for hours. However, on the UAT server it finishes within a few minutes. Both servers have the same data and hardware configuration and run SQL Server 2012 SP1.

Solution: Isolating the issue and coming up with a solution wasn’t easy. Once we ruled out resource bottlenecks (both servers have similar configuration and similar I/O throughput), we took a look at the estimated query plan (we couldn’t compare with the actual execution plan because we couldn’t wait for the query to finish on the slow server).

We noticed that the query plans were very different between the two servers. Specifically, the estimated plan on the fast server included parallelism and hash match joins, while the plan on the slow server used many-to-many merge joins. A many-to-many merge join requires a tempdb work table for inner-side rewinds, which can certainly cause performance degradation.

[Screenshot: estimated query plans on the fast and slow servers]

Interestingly, the cardinality of the tables and the estimated number of rows didn’t change much between the two plans. Yet, the query optimizer chose very different plans. At this point, we figured that this could be an issue with statistics, although both servers were configured to auto-update statistics (SQL Server triggers an auto-update only after modifications to roughly 20% of the rows in a table). The statistics on the slow server probably just happened to have a sample distribution that led the optimizer down a path that ended in a serial plan instead of a parallel plan. Initially, we tried sp_updatestats but didn’t get an improvement. Then, we ran UPDATE STATISTICS <table name> WITH FULLSCAN on the four tables. This resolved the issue, and the query on the slow server executed on par with the query on the fast server.
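For reference, the statements we ran had this shape (assuming the four staging tables live in the dbo schema):

UPDATE STATISTICS dbo.VOUCHER WITH FULLSCAN;
UPDATE STATISTICS dbo.VOUCHER_LINE WITH FULLSCAN;
UPDATE STATISTICS dbo.DISTRIB_LINE WITH FULLSCAN;
UPDATE STATISTICS dbo.VCHR_ACCTG_LINE WITH FULLSCAN;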

Note: Updating statistics with full scan is an expensive operation that probably shouldn’t be in your database maintenance plan. Instead, consider:

1. Stick with default sampled statistics

2. Try hints for specific queries that exhibit slow performance, such as OPTION (HASH JOIN, LOOP JOIN), to preclude the expensive merge joins, as sketched below.
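To illustrate the second option, here is a sketch of the hint in context; the join keys and columns are hypothetical and only the OPTION clause matters:

SELECT v.VOUCHER_ID, SUM(al.MONETARY_AMOUNT) AS AMOUNT
FROM dbo.VOUCHER v
    INNER JOIN dbo.VOUCHER_LINE vl ON vl.VOUCHER_ID = v.VOUCHER_ID
    INNER JOIN dbo.DISTRIB_LINE dl ON dl.VOUCHER_ID = vl.VOUCHER_ID
    INNER JOIN dbo.VCHR_ACCTG_LINE al ON al.VOUCHER_ID = dl.VOUCHER_ID
GROUP BY v.VOUCHER_ID
OPTION (HASH JOIN, LOOP JOIN); -- restricts the optimizer to hash and loop joins for this query only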

Special thanks to fellow SQL Server MVPs Magi Naumova, Paul White, Hugo Kornelis, and Erland Sommarskog for shedding light in dark places!

Gartner’s 2013 Data Warehouse Magic Quadrant

Fellow MVP Andrew Brust discusses Gartner’s newly released DW (not BI, mind you) Magic Quadrant. It looks like Gartner fell in love with Teradata, but Microsoft is moving up as well thanks to the advances in its Parallel Data Warehouse (PDW) and in-memory offerings.

[Gartner 2013 Data Warehouse Magic Quadrant]

 

Analysis Services Processing and CXPACKET Waits

Performance tuning – my favorite! This blog post originated from a quest to reduce the processing time of an SSAS cube which loads some 2.5 billion rows and includes DISTINCT COUNT measure groups. The initial time to fully process the cube was about 50 minutes on a dedicated Dell PowerEdge R810 server with 256 GB RAM and two physical processors (32 cores total). Both the SSAS and database servers were underutilizing the CPU resources, with SSAS at about 60-70% CPU utilization and the database server at about 20-30%. What was the bottleneck?

By using the sys.dm_os_waiting_tasks DMV with a statement like the one below (you can also use the SQL Server Activity Monitor), we saw a high number of CXPACKET wait types.

SELECT
    dm_ws.wait_duration_ms,
    dm_ws.wait_type,
    dm_es.status,
    dm_t.TEXT,
    dm_qp.query_plan,
    dm_ws.session_id,
    dm_es.cpu_time,
    dm_es.memory_usage,
    dm_es.logical_reads,
    dm_es.total_elapsed_time,
    dm_es.program_name,
    DB_NAME(dm_r.database_id) DatabaseName,
    -- Optional columns
    dm_ws.blocking_session_id,
    dm_r.wait_resource,
    dm_es.login_name,
    dm_r.command,
    dm_r.last_wait_type
FROM sys.dm_os_waiting_tasks dm_ws
    INNER JOIN sys.dm_exec_requests dm_r ON dm_ws.session_id = dm_r.session_id
    INNER JOIN sys.dm_exec_sessions dm_es ON dm_es.session_id = dm_r.session_id
    CROSS APPLY sys.dm_exec_sql_text(dm_r.sql_handle) dm_t
    CROSS APPLY sys.dm_exec_query_plan(dm_r.plan_handle) dm_qp
WHERE dm_es.is_user_process = 1

The typical advice for addressing CXPACKET waits is to decrease the SQL Server parallelism by using the MAXDOP setting. This might help in some isolated scenarios, such as UPDATE or DELETE queries. However, the SQL Sentry Plan Explorer showed that each processing query was highly parallelized to utilize all cores. Notice in the screenshot below that thread 16 fetches only 14,803 rows.

[Screenshot: SQL Sentry Plan Explorer showing rows fetched per thread]

Therefore, the CXPACKET waits were simply caused by faster threads waiting for other threads to finish. In other words, the CXPACKET wait is just a coordination mechanism between the parallelized threads. To confirm this, we set the SQL Server MAXDOP setting to 1. Sure enough, the CXPACKET waits disappeared, but the overall cube processing time went up as well. In our case, the biggest benefit was realized not by decreasing the SQL Server parallelism but by increasing the parallelism of the processing job itself: we raised the maximum number of database connections on the SSAS data source. This reduced the overall processing time by some 20%.


You need to be careful here though. While increasing the connections to max out the CPU on the SSAS server will yield the biggest gain, it might also slow down other workloads, such as reports that query the cube while the database is being processed. So, as a rule of thumb, target no more than 80% CPU utilization to leave room for other tasks.
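For reference, the MAXDOP confirmation test mentioned above used the server-wide setting, which can be capped (and afterward reverted) like this:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;  -- serializes query plans; the CXPACKET waits disappear
RECONFIGURE;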

Columnstore Indexes To Speed ETL

SQL Server 2012 introduces columnstore indexes. Using the same in-memory VertiPaq engine that powers PowerPivot and Analysis Services Tabular, columnstore indexes can dramatically speed up SQL queries that aggregate large datasets. For a great introduction to columnstore indexes, see the video presentation “Columnstore Indexes Unveiled” by Eric Hanson. I personally don’t see columnstore indexes as a replacement for Analysis Services, because an analytical layer has much more to offer than just better performance. However, in a recent project we found a great use for columnstore indexes to speed up ETL processes.

Issue: Perform an initial load of a snapshot fact table for inventory analysis from another fact table with one billion rows. For each day, extract some 200 million rows from the source fact table and group them into a resulting set of about 300,000 rows to load the snapshot fact table for that day. The initial estimates indicated that the extraction query alone takes about 15 minutes when using a clustered index. And that’s just for one day. At this speed, the initial load could take weeks.

Solution: We upgraded to SQL Server 2012 and created a columnstore index on selected columns from the source fact table. We excluded high-cardinality columns that were not used by the extraction query to reduce the size of the index. Creating the index on the source table (1 billion rows) took about 10 minutes, which is very impressive. The disk footprint of the index was about 4 GB. We ran the same extraction query and saw a five-fold performance improvement: the query, which would previously run for 15 minutes with a B-tree clustered index, now finished in 3 minutes with the columnstore index.
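The index definition itself is a one-liner. Here is a minimal sketch with hypothetical table and column names; note that in SQL Server 2012 only nonclustered columnstore indexes are available and creating one makes the table read-only:

-- Columnstore index on the columns the extraction query needs; high-cardinality columns excluded
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FACT_SOURCE
ON dbo.FACT_SOURCE (Date_Id, Item_Id, Warehouse_Id, Quantity, Amount);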

When testing your queries with columnstore indexes, it is important to make sure that the query executes in Batch mode. As you can see in the screenshot below, the query uses the columnstore index and the execution mode is Batch. If it says Row, the query performance degrades significantly. Watch Eric Hanson’s presentation to understand why this happens and possible workarounds.

[Screenshot: query plan using the columnstore index with Batch execution mode]

When I ran the initial tests with the columnstore index, I ran into a gotcha. I tested a simple “SELECT SUM(X) FROM TABLE” query only to find that it executes in Row mode and takes about 2 minutes to finish. As usual, the first thing I try doesn’t work. As it turns out, a columnstore index currently doesn’t support batch mode with scalar aggregates. You need to rewrite the query with a GROUP BY, as Eric Hanson explains in more detail in his blog post “Perform Scalar Aggregates and Still get the Benefit of Batch Processing”. This is rather unfortunate because every ad-hoc report starts with the end user dropping a measure on the report and the report tool generating exactly such queries.
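Here is the gist of the rewrite, again with hypothetical table and column names: push the aggregation into a GROUP BY, which runs in Batch mode, and then aggregate the small intermediate result.

-- Scalar aggregate: runs in Row mode against a SQL Server 2012 columnstore index
SELECT SUM(Quantity) FROM dbo.FACT_SOURCE;

-- Rewrite: the inner GROUP BY aggregation runs in Batch mode,
-- and the outer SUM only touches the small intermediate result
SELECT SUM(t.Quantity)
FROM
(
    SELECT Item_Id, SUM(Quantity) AS Quantity
    FROM dbo.FACT_SOURCE
    GROUP BY Item_Id
) AS t;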

When testing the performance gain, it’s useful to compare how the same query would perform without a columnstore index. Instead of having two tables or dropping the index, you could simply tell SQL Server to ignore the columnstore index, such as:

SELECT … FROM … WHERE … GROUP BY …
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX)

 

To sum up, when you are pushing the envelope of traditional B-tree indexes and your queries aggregate large amounts of data, consider SQL Server 2012 columnstore indexes. The query optimizer automatically favors a columnstore index when it makes sense to use it. A columnstore index makes the table read-only, but the cost of dropping and recreating it is not that high. Or, you could add the index to speed up the initial DW load and drop it once the load completes.

 

Disabling Foreign Key Constraints

Disabling check constraints, such as foreign key constraints, is often required when populating a data warehouse. For example, you might want to disable a check constraint to speed up loading of a fact table.

  1. How do we disable check constraints in SQL Server?
    You can script each check constraint, or you can use the undocumented sp_MSforeachtable stored procedure.

    ALTER TABLE [dbo].[FACT_ITEM_INVENTORY] NOCHECK CONSTRAINT [FK_DIM_DATE_FACT_ITEM_INVENTORY_Date_Id] -- disables a specific constraint

    EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL' -- disables all constraints in the database

    Once the foreign keys are disabled, you can run the ETL job or delete records from the referenced table without referential integrity checks. Note that you cannot truncate a table that is referenced by a foreign key constraint, even if the constraint is disabled. You must drop the constraints if you want to truncate such a table. You can use sp_MSforeachtable to drop constraints as well, although I personally prefer to disable them for a reason that will become obvious in a moment.

  2. How do we enable check constraints?

The same way you disabled them, but this time you use the CHECK keyword:

ALTER TABLE [dbo].[FACT_ITEM_INVENTORY] CHECK CONSTRAINT [FK_DIM_DATE_FACT_ITEM_INVENTORY_Date_Id] -- enables a specific constraint

EXEC sp_MSforeachtable 'ALTER TABLE ? CHECK CONSTRAINT ALL' -- enables all constraints in the database

  3. Does SQL Server verify referential integrity for existing data when you re-enable check constraints?

    This is where the behavior differs between recreating and enabling constraints. If you drop and create the constraints, the server will check the data unless WITH NOCHECK is used.

    ALTER TABLE [dbo].[FACT_ITEM_INVENTORY] WITH NOCHECK
    ADD CONSTRAINT [FK_DIM_DATE_FACT_ITEM_INVENTORY_Date_Id]
    FOREIGN KEY ([Date_Id]) REFERENCES [dbo].[DIM_DATE] ([Date_Id])

    The server doesn’t check existing data if you enable a previously disabled constraint. You can manually check foreign key violations by using this command:

    DBCC CHECKCONSTRAINTS ('FACT_ITEM_INVENTORY') -- checks a specific table

    DBCC CHECKCONSTRAINTS -- checks the entire database

The CHECKCONSTRAINTS command outputs all rows that violate constraints.
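One caveat worth noting: by default DBCC CHECKCONSTRAINTS checks only enabled constraints, so include the disabled ones explicitly if you run it before re-enabling the foreign keys. A quick sketch:

DBCC CHECKCONSTRAINTS ('FACT_ITEM_INVENTORY') WITH ALL_CONSTRAINTS -- includes disabled constraints in the check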

UPDATE 10/31/2011

Reader Ian pointed out that re-enabling foreign key constraints without checking the data marks them as non-trusted. Consequently, the SQL Server optimizer may choose different execution plans for a certain range of queries. Tibor Karaszi explains this in more detail in his blog post Non-trusted constraints and performance. Paul White also mentions a bug in this area. This makes me believe that enabling check constraints without checking the data should be avoided. Therefore, to speed up fact table imports for large datasets, consider the following steps:

  1. Disable foreign key constraints
  2. Drop indexes on fact tables
  3. Import data
  4. Recreate indexes on fact tables
  5. Re-enable constraints with the WITH CHECK option


EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL' -- enables and re-validates all constraints in the database
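After the reload, you can verify that no foreign keys were left non-trusted. A minimal sketch using the catalog views:

SELECT OBJECT_NAME(parent_object_id) AS TableName, name AS ConstraintName
FROM sys.foreign_keys
WHERE is_not_trusted = 1;  -- constraints the optimizer will not rely on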