Overcoming Issues with TempDB Contention
-
Posted on November 6, 2014 by Derek Dieter
-
2
One of the possible bottlenecks for a busy SQL Server is contention in TempDB. This symptoms of this bottleneck are mostly excess wait_types of PAGELATCHEX that are blocking one another. (you can use sp_who3 in order to see whether those wait types are popping up during your busy times). The cause of this is either implicit or explicit use of TempDB. Explicit being creation of table variables and temp tables, or implicit being worktables being generated by complex execution plans.
Luckily there are a number of steps you can take in order to overcome these issues.
1) Enable traceflag 1118
There is some argument as to whether this trace flag actually makes a difference. first of all, MS SQL CSS engineers state it’s already the default for 2008. What the flag does is immediately allocate space for tempdb tables upon creation rather than on insert (if I understand it correctly). It did seem to help our database during one point of contention, but I’m not willing to test it again by turning it off. So with that said, “it can’t hurt to turn it on” at least that’s that the consensus of what I’ve read.
2) Put TempDB on a PCIE SSD
The PCIE cards are getting cheaper every month. Just putting this in, reduced our write speed from 2.3ms to .3ms. What that translates into is, it’s quicker to get “in and out” of tempdb allowing less contention and higher throughput. Also while I’m too lazy to prove it, 2ms of increased write speed at the disk level equates to probably 15ms at the query level. 15ms at the query level equates to a 30ms speed up at the web layer. Yes I am too lazy to prove that, and go ahead and call me on it, but I do believe that to be true.
3) Evaluate your high tempdb usage queries
Ah, why did I put this 3rd? Because I believe you should have the two above done by default anyway. No reason not to. But, let’s not avoid what IS the source of the issue. With that said, you want to look at the queries that have a high amount of total writes overall and figure out why they are writing a lot. As said before, tempdb writes come in a few different flavors. First is explicit writes – meaning creation of temp tables and table variables. Next are implicit writes, these are typically from horrible queries (or maybe just ones that contain certain operations) or it can be because you have enabled snapshot isolation. So you need to determine these and have them write as little as possible. In order find the queries that implictly write you will have to replicate the query and turn on SET STATISTICS IO ON. From there look to see activity of WORKTABLE in the Messages tab of SSMS. Hint: WORKTABLE=TempDB. Here’s the query to find the biggest contributors that write:
[cc lang=”sql”]
SELECT TOP 100 *
FROM
(
SELECT
DatabaseName = DB_NAME(qt.dbid)
,ObjectName = OBJECT_SCHEMA_NAME(qt.objectid,dbid) + ‘.’ + OBJECT_NAME(qt.objectid, qt.dbid)
,DiskReads = SUM(qs.total_physical_reads) — The worst reads, disk reads
,MemoryReads = SUM(qs.total_logical_reads) –Logical Reads are memory reads
,Total_IO_Reads = SUM(qs.total_physical_reads + qs.total_logical_reads)
,Executions = SUM(qs.execution_count)
,IO_Per_Execution = SUM((qs.total_physical_reads + qs.total_logical_reads) / qs.execution_count)
,CPUTime = SUM(qs.total_worker_time)
,DiskWaitAndCPUTime = SUM(qs.total_elapsed_time)
,MemoryWrites = SUM(qs.max_logical_writes)
,totalLogicalwrites = SUM(qs.total_logical_writes)
,Write_Per_Execution= SUM((qs.total_logical_writes) / qs.execution_count)
,DateLastExecuted = MAX(qs.last_execution_time)
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
GROUP BY DB_NAME(qt.dbid), OBJECT_SCHEMA_NAME(qt.objectid,dbid) + ‘.’ + OBJECT_NAME(qt.objectid, qt.dbid)
) T
ORDER BY totalLogicalwrites DESC
[/cc]
Keep in mind this query will shows the descending order of the procedures that are doing the most writes. That does NOT mean they are writing to TempDB. You will have to weed through them (statements and all) to find out why they are writing a lot.
However another very useful tool, is Adam Machanic’s sp_whoisactive procedure. It will show you in real-time what procedures are writing to tempdb and how much.
Now that you’ve found the procedures, comes the fun part, re-writing them. That’s more of a multifaceted “can’t provide much info without seeing it” issue. What I generally can say is don’t write a lot where you can help it. Sometimes there’s no getting around it however. Making procedures fast and reliable a lot of times takes explicit TempDB writes, but in these cases you want to ensure you limit the result set as much as possible.
4) increase the number of TempDB data files in blocks of 4
The old rule of thumb used to be to have the same number of data files as you have cores. But with many multiprocessor hyperthreaded machines exist, you don’t want to go making 64 tempdb data files. Not to mention there is a point of diminishing returns on their creation because they are written to in a round robin approach (not in parallel) so there’s probably some slowness on figuring out where the table was created (I’m guessing). I would say on a 64 core box, it might not be uncommon to have 24 files created on a very busy box. It is also important that are all created at the same size and give enough space for a single one of them to handle a very large query.
- Comments (RSS)
- Trackback
- Permalink