SQL Server’s Batch Requests represents the number of SQL Statements that are being executed per second. In my opinion this is the one metric that gives the best baseline on how your server is performing. It cannot be looked at by itself however. It is necessary to correlate with other metrics (especially CPU usage) in order to get an overall understanding of how your server is performing. The overall goal of any SQL Server is to be able to achieve the most batch requests/sec while utilizing the least amount of resources. On some busy machines I’ve worked with, this counter averaged around 25,000 batch requests/sec during peak time. This peak throughput is heavily dependent on the architectural design of the system, including procedures, tables, and index design. One notable example of this was a system I worked with whose procedures were written using loops (ugh).. The average batch requests we could ever achieve was only 200. I was very surprised to initially see this, however digging deeper into the code I became less and less shocked. Ultimately I found one piece of code that affected the throughput of the entire system. It was a scalar UDF defined as a computed column in the busiest table on the system (don’t get me started). Anyway, after rewriting that one process, the system then found batch request peaks that went over 3500! The fact that the system could achieve that number now was a big achievement. It meant the overall throughput of the […]
Continue reading ...
In order to fix deadlocks, it’s critical to understand why they occur. I’ve gotten some push back sometimes from readers when I write articles that do not give cookie cutter solutions, but the “teach a man to fish” adage couldn’t be more true for deadlocks. All of it revolves around locking, which is a central piece to SQL Server’s default transaction isolation level of read-committed. Now what is read committed? Some people are critical of this isolation level because with a different isolation level like Snapshot, it is much less likely that deadlocks occur because each transaction utilizes it’s own version of what a table looks like at the very time the transaction begins. There are trade-offs with shapshot however. Snapshot isolation is much more disk intensive because these “snapshots” need to be persisted in a temporary data store. So if you are ever considering enabling snapshot isolation as your default isolation level for SQL Server you will want to make sure you have a beefy RAID 10 tempdb storage. So with read-committed we have less tempdb data swapping, but more locking. There are many different kinds of locks and many different objects that can get locked. Each of these types of locks may or may not be compatible with other types of locks. Locks are acquired so that users have a consistent view of what the data looks like at the time they are querying it. Imagine user 1 begins updating a table to inactivate all the records while […]
Continue reading ...
If you write TSQL then comparing the performance of two queries will probably be something you do on a daily basis. The difficult part of comparing queries is getting an accurate baseline. Why is this you ask? Because SQL Server has behind the scenes functionality that optimizes queries for multiple executions. This allows minimal resources the second time a query is run. This is why often times, you will notice the second run can be considerably faster than the first. There are a few features that make this happen. Plan Caching Immediately prior to the execution of a query, SQL Server needs to determine the best route to take for the query. This includes the indexes to use, the order of the tables to reference, the types of join algorithms to perform, and many other factors. Finding this route is a rather expensive process, so instead of performing this every time a query is ran, SQL Server caches the plan in memory so it can be reused. Naturally, the second time the query is run, it will be quicker because it does not have to determine the execution plan. There are two ways to clear a plan cache. You can nuke all plans on the instance using DBCC FREEPROCCACHE with no parameters. Or you can pass in the parameter of the plan handle to clear. Just running the following command will clear the plan cache on the entire instance [cc lang=”sql”] — Clear the instance DBCC FREEPROCCACHE [/cc] This will […]
Continue reading ...
If you read execution plans enough, you’ve probably realized that when SQL Server joins tables together, it uses different internal algorithms. The three algorithms are: Loop Join Merge Join Hash Join These alogorithms that are used are based upon factors of the underlying data. Merge Join For the most part, this is the most efficient method of joining tables. As the name implies, both tables are essentially merged together, much like a zipper being zipped. It typically occurs when both tables that are being joined, are joined on keys that are presorted and contain all the same keys in both tables (for example joining a primary key with a foreign key). When one table contains keys that the other table does not have, the chance of the merge join being used is less likely. The physical profile of a merge join is very little CPU usage, and very little reads compared to other types of joins. Loop Join The loop join is more CPU intensive than a merge join. This join typically occurs when one worktable is quite a bit smaller than the other. As the word loop implies, the smaller table being joined is looped until it finds the matching key in the outer (larger) table. This join is most efficient when the resulting output is smaller than 5000 rows. When larger than that, the CPU and reads make the join less efficient. Hash Join A hash join is the least efficient of all joins, however that does not […]
Continue reading ...
Parallelism is a feature in SQL Server which allows expensive queries to utilize more threads in order to complete quicker. The query optimizer makes the determination of how expensive a query is based upon the Cost Threshold for Parallelism setting set at the SQL Server Instance level. This cost threshold for parallelism is an estimate (roughly defined in seconds) that the query optimizer has determined a statement will take based on an estimated (or cached) execution plan. Generally the queries that qualify for parallelism are high IO queries. In a normal process, a source is read using a single SPID (Server Process ID) and it output using the same SPID. In parallelism, multiple SPIDs are used to read a source (this is known as distributing streams), then an operation may be performed in the streams, then the streams are gathered. Below is an illustration of this: Although parallelism may seem like a good idea, often times it is not. For OLTP systems that facilitate a lot of user requests parallelism is usually an indication of poorly written queries and/or queries that are in need of indexes. It can also cause issues on servers that have a lot of processors that also have disk IO contention. The reason for this is because parallelism will by default spawn as many SPIDs as their are processors. In other words, if you have a rogue query joining a hundred million records that does not qualify for parallelism, then by itself, only one process will […]
Continue reading ...