- Find the most costly statements
- Determine why the statement is costly
- Get an accurate baseline for the procedure
Find the Most Costly Statments
There are two main ways to determine costly queries. One is to execute the procedure including an Actual Execution Plan. The other is to get a time difference of a before and after for each statement. The are also other ways (including IO usage and CPU usage but we won’t cover those here).
I like to start with the execution plan method. Though this is not the most precise method, it is the easiest.. so let’s start there. Here is how to find the slowest query in a procedure using the execution plan.
Determine why the statement is costly
This can be them most difficult task. With the use of the execution plan, we can help deduce the most common issues.
Now that we’ve found the statement, we need to drill down even further to find out what is causing the statement to be slow. This is done by looking at the cost for each operation in the execution plan. Under each operation, you will see a “cost: x%”. You want to find the operation with the highest percentage and focus on that.
When you do find it, many times it will be one of the following operations:
Scan operations are not bad when the cardinality of the column being searched is less than approx 6%, or when the amount of records being scanned is below around 500 records. Otherwise, if you have a scan that has a high cost associated with it, you will want to find the appropriate indexes to fix this issue.
Large Work Tables / Results
This is the issue above that is depicted by thick arrows. When this happens a lot of times it is a bad order of operations. What that means is that the optimizer did not choose the limit the overall result set by joining the smallest tables first. Instead, it is trying to join and filter from the largest tables, and then join the smaller tables. While this may not be the only scenario this happens in, it does happen often.
To fix this, you may consider breaking the query up and dump results into a temp table prior joining on the rest of the tables. Make sure to index your temp table!
RID Lookups / Key Lookups
These are also not always bad. In fact you cannot get around this issue all the time. Why? Because you can only have one clustered index, and you don’t want to include every column in the table in all your non clustered indexes. However if you run into this issue having a high cost, you will want to consider changing your clustered index, or adding the columns being looked up in the Key Lookup using the INCLUDE statement for indexes.
Parallelism can have a significant impact on queries and your server. What this means is that the amount of data being usurped by your query is very large, so in order to speed it up, SQL Server thinks it would be best to create more SPIDs to handle the operation. When this happens look for your page life expectency to dip low and your disk utilization to go high. Consider employing the same technique as above of breaking up your query. Also look at the order of operations and change the query around. If that doesn’t help, add the query hint OPTION (MAXDOP 1). Like this:
FROM dbo.mytable mt
JOIN dbo.yada y
ON mt.ID = y.ID
OPTION (MAXDOP 1)
This guy may fool you when it comes to the cost factor, however it can be a big hindrance to performance when this operation is between table joins. The “scalar” part should worry you, because it does perform somewhat of a row-by-row operation. When this is between table joins, this operation is there to do an implicit conversion of a datatype in order to join the columns correctly. Since it cannot join the two tables directly it will often force too many records to be touched in order to be converted. The conversion may then be dumped into a worktable (in the background) then that worktable will be used for joins after the conversion. Just make sure to join like data types.
Get an Accurate Baseline
Next before you make changes, get an accurate baseline prior to making changes so when you do make changes, you can be sure that they sped things up.
When you execute a procedure, the time it takes to execute is in the lower right-hand corner as shown below.
From the above, we can see the problem with only using the client side timer. It does not show milliseconds, and milliseconds do matter. The other problem that we cannot see above with the client-side timer is that it also includes the round trip time to the server and back to your computer. While this may seem like a more accurate depiction, it makes optimizing harder because you get less consistent results. In my opinion, it’s better to get the server-side timings.
To get the server-side timings, we turn on the time statistics, using set statistics time command.
We enable it using this:
Once executed, it stays on for the entire session (or window) until you close the window or turn it off (by replacing ON with OFF). So you only need to execute this one time, then you can remove the statement.
The next step needed to get consistent results is to clear the caches. One cache holds the compiled version of the SQL Procedure (the execution plan), the other cache holds the data pages that are retrieved from disk.
To clear the compiled execution plan, we use:
To clear the data pages, we use:
Now optimize and apply the changes discovered above! I know, easier said than done sometimes. Even though we did not cover all situations for query slowness, hopefully this will provide a good launching pad. We will delve further into fixes in a later article.