What is Parallelism?

Posted on November 18, 2010 by Derek Dieter
4

Parallelism is a feature in SQL Server which allows expensive queries to utilize more threads in order to complete quicker. The query optimizer makes the determination of how expensive a query is based upon the Cost Threshold for Parallelism setting set at the SQL Server Instance level. This cost threshold for parallelism is an estimate (roughly defined in seconds) that the query optimizer has determined a statement will take based on an estimated (or cached) execution plan. Generally the queries that qualify for parallelism are high IO queries. In a normal process, a source is read using a single SPID (Server Process ID) and it output using the same SPID. In parallelism, multiple SPIDs are used to read a source (this is known as distributing streams), then an operation may be performed in the streams, then the streams are gathered. Below is an illustration of this: Although parallelism may seem like a good idea, often times it is not. For OLTP systems that facilitate a lot of user requests parallelism is usually an indication of poorly written queries and/or queries that are in need of indexes. It can also cause issues on servers that have a lot of processors that also have disk IO contention. The reason for this is because parallelism will by default spawn as many SPIDs as their are processors. In other words, if you have a rogue query joining a hundred million records that does not qualify for parallelism, then by itself, only one process will […]

SQL Current Date

Posted on November 14, 2010 by Derek Dieter
No

To get the current date in SQL Server use the GETDATE() function. Here is an example: [cc lang=”sql”] SELECT CurrentDate = GETDATE() [/cc] GetDate() returns a datetime data type value. There are also other methods in SQL Server that will return the current date. These are less commonly used. In fact, I have never personally had a need to use any of them except GETDATE(). [cc lang=”sql”] SELECT SYSDATETIME() SELECT SYSDATETIMEOFFSET() SELECT SYSUTCDATETIME() SELECT CURRENT_TIMESTAMP SELECT GETDATE() SELECT GETUTCDATE() [/cc]

KEEPFIXED PLAN Query Hint

Posted on November 5, 2010 by Derek Dieter
1

The KEEPFIXED PLAN query hint is used in order to retain the original execution plan used to initially compile a statement. It is typically used to disallow a recompile, especially in regards to a statement that utilizes temp table or a table that changes it’s percentage of records frequently resulting in triggering a recompile for a statement. To use, specify OPTION (KEEPFIXED PLAN) immediately after the statement that you do not want to recompile. [cc lang=”sql”] SELECT * FROM #Temp OPTION (KEEPFIXED PLAN) [/cc] Temp tables frequently recompile due to the change in statistics. When using temp tables within stored procedures, this can be a disadvantage. To get around the recompile, either use table variables (indexed with constraints) or use the KEEPFIXED PLAN query hint. Recompiles typically happen when the percentage of a tables (or temp tables) rows change by 500 and the cardinality (or uniqueness) changes by 20%.

MAXDOP Query Hint

Posted on November 5, 2010 by Derek Dieter
5

The MAXDOP query hint is used to implicitly specify the maximum degree of parallelism to use for a specific statement. The degree of parallelism is roughly defined as how many separate processors to utilize for a single query. To set this, use the following convention: [cc lang=”sql”] SELECT * FROM Sales.SalesOrderDetail OPTION (MAXDOP 1) [/cc] This will force only one instance of the SPID to be spawned. To allow more instances of a SPID to be spawned replace the 1 with the maximum number. This needs to be specified after each statement. It is not specific to the statement. As with all query hints, it is recommended that you know internally what you are controlling by specifying this hint. While this specific hint is relatively harmless if you specify “1” it is still recommended you understand it’s effects.

Configuring Parallelism

Posted on November 4, 2010 by Derek Dieter
No

Parallelism is a powerful feature in SQL Server designed to allow greater bandwidth for high impact queries. Most of the time however, I’ve seen too many queries qualifying for parallelism and essentially bringing the SQL Server box to it’s knees. To understand parallelism, we need to understand SPIDs (Server Process ID’s). These SPID’s act essentially as threads on a SQL Server. Normally there is one SPID assigned to each database connection. Sometimes, if the query is determined to be expensive, a SPID can be split out into multiple execution threads. This is known as parallelism. Cost Threshold for Parallelism Prior to breaking a SPID out into multiple threads, there is a cost determination. This is called the cost threshold for parallelism (set in seconds) and is set at the server-level. The idea is to allow parallelism when a query is estimated to run over a certain number of seconds. Generally, the determining factor is the estimated number of rows that will be returned either in the main query or an inner query within the execution plan. To set parallelism right click on the Server Instance in Management Studio. And select “properties”. Once properties is selected, a new window pops up displaying a more detailed server settings. We want to select on “Advanced” Now, in the lower middle of the screen we see the settings for Parallelism. The main two we are concerned with the is the cost threshold and the Max Degree of parallelism. What Settings to Choose? You need […]

What is Parallelism?

SQL Current Date

KEEPFIXED PLAN Query Hint

MAXDOP Query Hint

Configuring Parallelism

Featured Articles

Site Author

Resources