Using the Merge Statement to populate a historical table with Effective Dating

Posted on June 3, 2009 by Derek Dieter

One of my favorite uses for the MERGE statement introduced in SQL Server 2008 is updating a historical table. In versions prior to 2008, this operation had to be performed in two separate statements; MERGE streamlines the process. The advantage to the database engine when using a MERGE statement is in its minimal locking. Prior to MERGE, we had to update the record if it existed and insert it if it did not. Long story short, it was double the amount of locking before the optimizer knew which rows to exclusively lock.

To understand what’s going on in the statement below, start reading from the MERGE statement, since that is where the initial update takes place. As you scroll down, you will see the OUTPUT clause, which outputs the results of the MERGE statement. You can think of this output as a virtual table named “merged”. The $action variable is an intrinsic column (if you will) that contains the action performed on that row during the MERGE. Lastly, if you scroll to the top, you will find the INSERT statement that finally inserts the new “current” record for the corresponding “expired” record we just retired. This was the same row we used as the criteria to UPDATE the expired row; however, we did not use any of its values.

[cc lang="sql"]
DECLARE @Now datetime = GETDATE()
DECLARE @EffToDate datetime = '2079-06-06T00:00:00.000'
DECLARE @JobID int = 1
-- This […]
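As a rough sketch of the pattern described above (not the original listing), the following shows an INSERT wrapped around a MERGE whose OUTPUT is exposed as a derived table named “merged”. The tables dbo.JobHistory and dbo.JobStaging, their columns, and the MergeAction alias are all hypothetical names chosen for illustration. Note that SQL Server places restrictions on the target of the outer INSERT in this form (for example, it cannot have enabled triggers or take part in a foreign key relationship).

[cc lang="sql"]
-- Hypothetical tables, for illustration only
CREATE TABLE dbo.JobStaging (
    JobID     int         NOT NULL,
    JobStatus varchar(20) NOT NULL
);
CREATE TABLE dbo.JobHistory (
    JobID       int         NOT NULL,
    JobStatus   varchar(20) NOT NULL,
    EffFromDate datetime    NOT NULL,
    EffToDate   datetime    NOT NULL
);

DECLARE @Now datetime = GETDATE();
DECLARE @EffToDate datetime = '2079-06-06T00:00:00.000';

-- Outer INSERT: adds the new "current" row for every row the MERGE expired
INSERT INTO dbo.JobHistory (JobID, JobStatus, EffFromDate, EffToDate)
SELECT merged.JobID, merged.JobStatus, @Now, @EffToDate
FROM (
    MERGE dbo.JobHistory AS tgt
    USING dbo.JobStaging AS src
        ON  tgt.JobID = src.JobID
        AND tgt.EffToDate = @EffToDate          -- match only the current row
    WHEN MATCHED AND tgt.JobStatus <> src.JobStatus THEN
        UPDATE SET tgt.EffToDate = @Now         -- expire the changed row
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (JobID, JobStatus, EffFromDate, EffToDate)
        VALUES (src.JobID, src.JobStatus, @Now, @EffToDate)
    OUTPUT $action AS MergeAction, src.JobID, src.JobStatus
) AS merged
WHERE merged.MergeAction = 'UPDATE';            -- brand-new jobs were already inserted by the MERGE
[/cc]

The WHERE filter on the $action value matters here: rows inserted by the WHEN NOT MATCHED branch already have a current record, so only the rows reported as 'UPDATE' need a replacement.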

Transferring Large Amounts of Data using Batch Inserts

Posted on May 31, 2009 by Derek Dieter

Below is a technique used to transfer a large number of records from one table to another. This scales pretty well for a couple of reasons. First, it will not fill up the entire log prior to committing the transaction. Rather, it will populate the table in chunks of 10,000 records. Second, it’s generally much quicker. You will have to play around with the batch size. Sometimes it’s more efficient at 10,000, sometimes at 500,000, depending on the system.

If you do not need to insert into an existing table and just need a copy of the table, it is better to do a SELECT INTO. However, for this example, we are inserting into an existing table. Another trick is to change the recovery model of the database to simple. This way, there will be much less logging in the transaction log. The WITH (TABLOCK) below only works in SQL Server 2008.

[cc lang="sql"]
DECLARE @BatchSize int = 10000

WHILE 1 = 1
BEGIN

    INSERT INTO [dbo].[Destination] --WITH (TABLOCK) -- Uncomment for 2008
    (
        FirstName
        ,LastName
        ,EmailAddress
        ,PhoneNumber
    )
    SELECT TOP(@BatchSize)
        s.FirstName
        ,s.LastName
        ,s.EmailAddress
        ,s.PhoneNumber
    FROM [dbo].[Source] s
    -- Skip rows that were already copied in an earlier batch.
    -- Note: PersonID is not in the column list above, so this check
    -- assumes Destination.PersonID is populated to match Source.PersonID.
    WHERE NOT EXISTS (
        SELECT 1
        FROM dbo.Destination d
        WHERE d.PersonID = s.PersonID
    )

    -- A short batch means the source is exhausted
    IF @@ROWCOUNT < @BatchSize BREAK

END
[/cc]

With the above example, it is important to have at least a nonclustered index on PersonID in both tables. Another way to transfer records is to use multiple threads, each specifying a range of records, as such:

[cc lang="sql"]
INSERT INTO [dbo].[Destination] […]
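For the two side notes in this post, the recovery model change and the SELECT INTO alternative, here is a minimal sketch. The database name MyDatabase and the table name dbo.DestinationCopy are hypothetical, and the sketch assumes the database normally runs in the full recovery model and that dbo.Source has the columns used in the batch example.

[cc lang="sql"]
-- Check the current recovery model first so it can be restored afterwards
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = N'MyDatabase';        -- hypothetical database name

-- Switch to simple recovery for the duration of the load
ALTER DATABASE [MyDatabase] SET RECOVERY SIMPLE;

-- ... run the batched INSERT loop here ...

-- Switch back (assuming the database was in full recovery before),
-- then take a full or differential backup to restart the log backup chain
ALTER DATABASE [MyDatabase] SET RECOVERY FULL;

-- Alternative when no existing destination table is needed:
-- SELECT INTO creates dbo.DestinationCopy (hypothetical name) and fills it in one statement
SELECT s.PersonID
    ,s.FirstName
    ,s.LastName
    ,s.EmailAddress
    ,s.PhoneNumber
INTO dbo.DestinationCopy
FROM [dbo].[Source] s;
[/cc]

Changing the recovery model affects point-in-time restore for the whole database, so it is worth agreeing on with whoever owns the backups before doing it on a production system.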
