The debate whether to use temp tables or table variables is an old debate that goes back since they were first introduced. The real answer to knowing the difference lies in what is going on under the hood and correlating those specifics to your situation. In this article we’ll touch on (hopefully all) the differences between the two. #Temp tables are just regular SQL tables that are defined and stored in TempDB. The only difference between them and a permanent table is they are not allowed to have foreign keys. You can view the temp tables currently defined via SSMS by going to TempDB and Expanding Temp Tables. Let’s look at a matrix of specific differences then we’ll touch on a few of the key differences below. Item #Temp Tables @Table Variables Can participate in a transaction Writes to Log File Writes only to memory (not disk) Can Qualify for Parallelism Allows creation of statistics Does not affect recompilation Allows nonclustered indexes Allows clustered indexes Can perform SELECT INTO Can access in nested stored procedures Can define globally Can use in user defined functions Can insert from EXEC Allows TRUNCATE Allows ALTER TABLE Scope #Temp tables can be either #locally defined (within the scope of the session) or ##globally defined (within a database). When they are locally defined, the table can be used by whatever is executed within that session (or SPID). In other words, if you define a #temp table in procedure A, then procedure A calls procedure B, […]
Continue reading ...
If you’re like me, after using UNION for many years you may have stumbled upon the UNION ALL and thought, what the heck is the difference? Well it’s pretty simple. UNION performs a distinct and gives you the unique results for all UNION’ed queries, whereas UNION ALL does not perform a distinct and gives you all rows regardless of whether they are duplicate. The performance difference is that UNION ALL can be much faster since distinct is an expensive operation. Here is a simple example detailing the difference. We’ll populate a single temp table with one record, then we’ll do both a UNION, and a UNION ALL to see the results. [cc lang=”sql”] — create the table and populate sample data SELECT name = ‘derek’ ,detail = ‘this is record 1’ INTO #faba — perform the UNION on the same table SELECT * FROM #faba UNION SELECT * FROM #faba — perform the UNION ALL on the same table SELECT * FROM #faba UNION ALL SELECT * FROM #faba [/cc] From the results we see below, the UNION only returns one record. The execution plans are identical except for the distinct operation being performed as the last operation before output. From this example we see how much utilization distinct is consuming relational to the other operations. (It’s a lot, however this is a rather simple operation) Long story short, if you know you’re two separate result sets contain distinct values, it’s best to use UNION ALL. If for some reason […]
Continue reading ...
The WITH (nolock) hint is an explicit command directed at a specific table or view used to set the transaction isolation level against the table or tables within a view for a query. Once issued, locks will not be used against the data within the table. The advantage to this is there is no chance a deadlock will occur against any other queries running against the table. The other indirect advantage is that less memory will be used in order to hold locks against that data. Example: [cc lang=”sql”] SELECT first_name, last_name, FROM dbo.person p WITH (NOLOCK) JOIN dbo.employee e WITH (NOLOCK) ON e.person_id = p.person_id WHERE p.person_id = 1; [/cc] The nolock setting above is explicit for the table it is being set against. To set the value globally for the scope of the connection, see SET TRANSACTION ISOLATION LEVEL Advantages: Deadlocks will not occur against other queries running against the same data Less memory is utilized due to the lack of row, page, or range level locking Typically allows for much higher concurrency due to lower footprint Disadvantages: Uncommitted data can be read leading to dirty reads Explicit hints against a table are generally bad practice Usage In most places I have worked, with (nolock) has been a generally accepted practice in the specific areas of the system that are not sensitive to data being slightly out of sync. It is important to know where things could go wrong though. The biggest red flag I can think of […]
Continue reading ...
A common situation in a data warehouse is the requirement to detect changes in data in order to track what rows need to be imported. The traditional method of comparing the values of each field is performance intensive. Luckily there are other methods to quickly track the changes that involve creating hashes (or a fingerprint) of a particular data row. In using this method, if we want to synchronize two separate tables, we can simply join on the primary key and only compare this fingerprint column in order to determine what has changed. There are two major methods I’ve used to create a row valued hash key. The first is by using the CHECKSUM function. The other is to use the HASHBYTES function. Both of these function return back a single value representing a hash, however their parameters differ. With CHECKSUM you can pass in a list of columns to evaluate and it returns an integer value. Whereas HASHBYTES requires a single parameter to be passed in and returns back a 16-bit binary value. The trick to forcing HASHBYTES into accepting multiple column values is to use the FOR XML function which will generate a single value to pass in. The obvious difference between the two functions is the size and datatype of the hash being returned. To make a long story short, there are rare occasions (that I have witnessed more than once) where passing in different column values into CHECKSUM will return back the exact same value. Granted […]
Continue reading ...
If you have ever tried to order numerical values in a varchar field you’ll know that the sort doesn’t occur numerically as it would if the datatype were numeric. In other words, the values of 1 and 10 will be sorted together because they both start with a leading 1. To overcome this, we have to cast the values as numeric. But this raises another issue. Since it’s a varchar field we cannot ensure that the values are all numeric which means we additionally have to take into account characters. Let’s start by creating a sample table and populating some values: [cc lang=”sql”] CREATE TABLE #varchar_field ( ID INT IDENTITY(1,1), mixed_field varchar(100), ) INSERT INTO #varchar_field (mixed_field) SELECT ‘1’ union all SELECT ‘4.9’ union all SELECT ’10’ union all SELECT ’50’ union all SELECT ‘6’ union all SELECT ‘a’ union all SELECT ‘z’ union all SELECT ‘A’ [/cc] Now let’s run a simple sort so we can see the default behavior. [cc lang=”sql”] SELECT * FROM #varchar_field ORDER BY mixed_field [/cc] Here we see everything is only sorted by it’s leading character, not taking into consideration the value of the numbers. Now let’s execute a revised version of this order by. [cc lang=”sql”] SELECT * FROM #varchar_field ORDER BY CASE WHEN ISNUMERIC(mixed_field) = 1 THEN CAST(mixed_field AS FLOAT) WHEN ISNUMERIC(LEFT(mixed_field,1)) = 0 THEN ASCII(LEFT(LOWER(mixed_field),1)) ELSE 2147483647 END [/cc] Here we check to see if the field is numeric first. If it is, we cast it to float to deal with […]
Continue reading ...