Skip to content

SQL Server Add Primary Key

Adding a primary key can be done either after a table is created, or at the same a table is created. It is important to note, that by default a primary key is clustered. This may or may not be the preferred method of creation. For more information on clustered vs non clustered indexes, please see this article. Either way, you can specify the clustered / non clustered option when creating a primary key.

To create a clustered primary key on an existing table:

ALTER TABLE dbo.Person ADD PRIMARY KEY (PersonID);

To create a non clustered primary key on an existing table:

ALTER TABLE dbo.Person ADD PRIMARY KEY NONCLUSTERED (PersonID);

To create a composite primary key on an existing table:

ALTER TABLE dbo.Person ADD PRIMARY KEY (PersonID, DOB);

For an existing table, there are two places that can be used to specify the primary key. The first is inline to the column. Using this method, you cannot create a composite primary key:

CREATE TABLE [dbo].[Person](
	PersonID [int] IDENTITY(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
	FirstName VARCHAR(50) NULL
)

The second way is to specify the primary key after the creation of the table, this will enable you to create a composite primary key. Keep in mind that in order to use a column in the primary key, the field must not be nullable.

CREATE TABLE [dbo].[Person](
	PersonID [int] IDENTITY(1,1) NOT NULL,
	FirstName VARCHAR(50) NOT NULL
	PRIMARY KEY NONCLUSTERED (PersonID, FirstName)
)

SQL Server Select

The SQL Server Select statement is the first statement used when returning data. It is the most used and most important statement in the T-SQL language. The Select statement has many different clauses. We will step through each clause further in the tutorial, however now, we will look at Select itself.

The following

SELECT 'Mike' AS FirstName

As we can see from our results, the code returns one row, one column. The data of the column returned is the value: ‘Mike’. The column’s name is ‘FirstName’.
Select_Statement_Ex_1

The ‘AS’ allows you to specify a column alias. This alias could be anything, in this case we made it ‘FirstName’. We can also Select multiple columns of data by separating the columns using a comma:

SELECT 'Mike' AS FirstName, 'Nichols' AS LastName
Returning two columns

Returning two columns

These examples so far, have only used the Select statement by itself. Now we will explore using Select in addition with the ‘From’ clause. The From Clause allows a source to be specified for the Select Statement. For the purposes of this example, the source is always a table. Let’s look at an example.

SELECT *
FROM Person.Person

Phonetically the above query is pronounced: Select star from Person dot Person. The star (*) or asterisks is notation for saying “all columns”. Basically, the query is saying give me all columns from the Person table contained within the Person Schema. The person schema is basically a class, or a way to better organize tables. This example will return every column for every row of the Person.Person table. We could also specify specific columns in order to limit the results:

SELECT
	FirstName,
	MiddleName,
	LastName
FROM Person.Person

This query would return all rows for the three columns listed. Now suppose we did not want to return all the columns listed. Instead, we only want to return people with the last name of ‘Adams’. This is where the ‘From’ clause comes in.

SELECT
	FirstName,
	MiddleName,
	LastName
FROM Person.Person
WHERE LastName = 'Adams'

The where clause is followed by the column name we want to filter. The equal sign separates the value we want the column to be equal to. And we place ‘Adams’ within single quotes because it is a string value (or not strictly numeric). This now returns only people that have the last name of Adams.

The Where clause is where the power of SQL begins to get introduced. We can also provide more specifics regarding the rows we want to return by adding additional criteria in the where clause. We do this by specifying the AND statement.

SELECT
	FirstName,
	MiddleName,
	LastName
FROM Person.Person
WHERE LastName = 'Adams' AND FirstName = 'John'

This query now only returns the rows where the FirstName is ‘John’ and LastName is ‘Adams’. Suppose we wanted either people with the last name of ‘Adams’ OR people with the last name of ‘Smith’. In order to return either of these names, we use the OR statement.

SELECT
	FirstName,
	MiddleName,
	LastName
FROM Person.Person
WHERE LastName = 'Adams' OR LastName = 'Smith'

Cannot insert explicit value for identity column in table ‘table’ when IDENTITY_INSERT is set to OFF.

This error occurs when trying to insert into a column containing an identity. An Identity column is not able to be inserted into without the use of a special command mentioned below. Identity columns are columns that automatically increment when a value is inserted into a row. They are commonly used as primary keys because they guarantee uniqueness.

In order to insert into a table containing an identity column

SET IDENTITY INSERT ‘tablename’ ON

IDENTITY INSERT ON can only be executed by a user having dbo privilidges

The following example illustrates the error and shows how to successfully insert.

-- Create MyNames Table with Identity Column
CREATE TABLE dbo.MyNames
(
	ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
	FName varchar(50) NULL
)

INSERT INTO dbo.MyNames
(
	FName
)
SELECT 'Abe'
UNION
SELECT 'Henry'
UNION
SELECT 'Phil'

-- Create YourNames Table with Identity Column
CREATE TABLE dbo.YourNames
(
	ID int IDENTITY(4,1) NOT NULL PRIMARY KEY, --Start Incrementing at 4
	FName varchar(50) NULL
)

INSERT INTO dbo.YourNames
(
	FName
)
SELECT 'Bill'
UNION
SELECT 'Candy'
UNION
SELECT 'Sara'

-- Attempt to Insert MyNames Identity Into YourNames
INSERT INTO dbo.YourNames
(
	ID,
	FName
)
SELECT
	ID,
	Fname
FROM dbo.MyNames

-- we get the error message:
Cannot insert explicit value for identity column in table 'YourNames' when IDENTITY_INSERT is set to OFF.

-- Attempt to Insert after turning on IDENTITY_INSERT ON
SET IDENTITY_INSERT dbo.YourNames ON

INSERT INTO dbo.YourNames
(
	ID,
	FName
)
SELECT
	ID,
	Fname
FROM dbo.MyNames

SET IDENTITY_INSERT dbo.YourNames OFF

SQL Server Kill

The kill command is used against a SPID. (Server Process ID). This command is typically used because something is misbehaving. In order to use the Kill command, simply type the word “kill” followed by a space and then the number of the Server Process ID to kill.

-- Kill Server process ID 98
kill 98

The kill command has three parameters. Two of which I have no idea what they mean:

KILL { session ID | UOW } [ WITH STATUSONLY ]

Best SQL Server Pagination Method

The best pagination method is actually mind-bogglingly simple. Aren’t all things like this?

It involves using (the mighty powerful) ROWCOUNT statement — twice.

Anyways, the code:

CREATE PROCEDURE [dbo].[spUsersPaged]
(
    @RowBegin int,
    @MaxRows int
)
AS
BEGIN

	DECLARE @StartID int	

	SET ROWCOUNT @RowBegin
	SELECT @StartID = UserID
	FROM dbo.Users u
	ORDER BY UserID

	-- Set maximum number of rows to return
	SET ROWCOUNT @MaxRows

	SELECT *
	FROM dbo.Users u
	WHERE UserID >= @StartID
	ORDER BY u.UserID

	SET ROWCOUNT 0

END

That’s it for the first pagination sample. Well, aren’t we excited? You may try other methods, however I challenge all Ye to overcome thy double ROWCOUNT method. Comment below please.

To give a variety, I’m sharing another pagination method. This one seems more elegant. It uses a Common Table Expression) CTE, and the RowNumber function.

SET @rowsPerPage = 10
SET @pageNum = 3  

WITH SQLPaging
AS
(
SELECT TOP(@rowsPerPage * @pageNum)
ResultNum = ROW_NUMBER() OVER (ORDER BY id)
,id
FROM dbo.Table1
)
SELECT *
FROM SQLPaging
WHERE ResultNum > ((@pageNum - 1) * @rowsPerPage)

Yet is it elegant, with speed?? ..uh not so much.

Using sp_who2

This article shows the usage of sp_who2. To diagnose system slowdowns, see (Troubleshooting SQL Slowness).

One of the first lines of defense in determining the causes of database slowdowns is to use sp_who2. sp_who2 shows all the sessions that are currently established in the database. These are denoted as SPID’s, or Server process Id’s. Running sp_who2 is easy, all that is required is to type sp_who2 and execute it.

sp_who2
Results of sp_who2

Results of sp_who2

There is no exec needed. sp_who2 also takes a optional parameter of a SPID. If a spid is passed, then the results of sp_who2 only show the row or rows of the executing SPID.

The first 50 results are system SPIDs. Generally these do not effect slowdowns of the system. These system events include the Checkpoint writer, Log writers and task schedulers. User processes are SPID numbers 50 and over. In diagnosing slowdowns it is these SPIDs that are the potential resource hogs.

There are four main things to look for when when diagnosing slowdowns.

  1. Blocking
  2. High CPU usage
  3. High IO usage
  4. Multiple entries for the same SPID representing parallelism.

When a number is shown in the column named BlkBy, this represents the SPID that is currently stopping the SPID in the row shown. Sometimes many rows will show SPID numbers in the BlkBy column. This is because there is a chain of blockers. The way this occurs usually starts with one “lead” blocker blocking another process. In turn, the process that is being blocked, blocks others. This occurs down a chain. It can be a messy situation. In order to rectify, you may have to kill the lead blocker. If it happens often, you will want to research why this particular process is blocking. So, before you kill any process, find out what statement it is running first. To do this, execute DBCC Inputbuffer.

High CPUTime or High DiskIO time is usually spotted by comparing the relative CPUTime to the DiskIO time. It should be noted that CPUTime and DiskIO time represent the sum of all executions since the SPID has been active. It may take some training before you are able to spot a high number here. At times, you will see very high CPUTimes and almost no corresponding DiskIO. This is usually indicative of a bad execution plan. For more information see this article (slow performance.

Multiple rows that have the same SPID are known as Parallel processes. This happens when SQL Server has determined that a particular query is going to take a long time to run (according to the parallel settings at the server level). When this happens, SQL Server will launch multiple threads for the same procedure. When this happens often, it can have a devastating effect on SQL Server. Particularly IO. To rectify, either raise the query threshold for parallelism or turn down the maximum processors that parallelism can use (MAXDOP) Max degree of parallelism, a setting at the server level.

sp_who2 does provide limited information regarding slowdowns. For a more in depth procedure look at sp_who3

Cannot Log in Remotely to SQL Server Express

SQL Server Express 2008 does not allow remote connections by default.  This is done by design as SQL Express is often used either as a development tool, or as a local installation for resident applications.  Another thing to note, is that SQL Express uses dynamic ports by default so in order to connect, The SQL Server Browser service needs to be started on the same machine that Express is installed on.

Below are the steps to perform:

  1. Enable TCP Connections for SQL Server Express
    • Start -> Programs -> Microsoft SQL Server 200x -> Configuration Tool -> SQL Server Configuration Manager
    • Expand SQL Server Network Configuration
    • Click Protocols for SQLExpress
    • In the right hand pane, right click TCP/IP and choose ‘Enable’
  2. Make sure SQL Server Browser Service is started. Without this service enabled, you will not be able to connect (unless you specified a static TCP port).

As long as you do not have any firewall issues, you should be able to connect

SQL Management Tools

Working in Administration or Development, we find ourselves performing the same tasks over and over. Often we do not recognize the time each of these menial tasks take and how much time out of our day they waste. Luckily, there are external tools we can rely on to make our lives easier.

SSMS Tools Pack

This is my favorite.  Or should I say our development teams’ favorite. Best of all, it is free.   After using some of the tools this plugin has to offer, there is no going back.

  1. Window Color Coding – Have you ever executed code on a server you shouldn’t have?  (Of course you have).  This plugin will put an end to that.  Out of the box, it will add a distinct colored border at the top of your Query Windows that will be different for each server you are on.  I use to mistaken servers (Dev, Prod, QA) not anymore.  This alone is worth the price of the plugin.. (Oh wait, it’s free).
  2. Format SQL – Yes, this is self explanatory.  Are you a lowercase coder, or uppercase coder?  (UPPER CASE RULES!).  Anyway, however you want your keywords to appear, this will help with the stroke of two quick keys.
  3. SQL Snippets – Store snippets of SQL and insert them into the query window with one quick keypress.

To download SSMS Tools Pack, click here

Compare Stored Procedure Dataset Output

Comparing the results of two different stored procedures output can be difficult task. The first difficulty is to return the result sets of the stored procedures into temporary tables. There are two different methods of doing this outlined in the link above. For this example we will use the local linked server method. Your method may differ depending upon whether your system setup will support one of the methods. It may also differ if you are looking at passing dynamic parameters (this method is also covered in the link above).

After the result sets are in the temp tables, we need to compare the datasets. We typically need to see the data in one of three ways:

  1. What is in one result set and not in the other
  2. All results that do not match
  3. All results that do match

First, let’s dump the procedures into tables:

-- Where local server is the name we gave to our local linked server
SELECT *
INTO #Procedure1
FROM OPENQUERY(LOCALSERVER, 'set fmtonly off exec dbo.Procedure1 param')

SELECT *
INTO #Procedure2
FROM OPENQUERY(LOCALSERVER, 'set fmtonly off exec dbo.Procedure2 param')

Now, considering we have SQL 2005 and above, we will compare what is in Procedure1 that does not match what is in procedure2. We do this using the new EXCEPT Statement.

-- Give us what IS in procedure1
SELECT *
FROM #Procedure1

EXCEPT
-- That does not match procedure2
SELECT *
FROM #Procedure2

Next, we will get what is in procedure2 and not in procedure1.

-- Give us what IS in procedure2
SELECT *
FROM #Procedure2

EXCEPT
-- That does not match procedure1
SELECT *
FROM #Procedure1

The next result set comparison is to get all results from both procedures that do not match. We do this using the CHECKSUM function. We do not need SQL 2005+ for this.

SELECT *
FROM (
	SELECT ID
		,ChkSum = CHECKSUM(*)
	FROM #Procedure1 ds
) ds
JOIN(
	SELECT ID
		,ChkSum = CHECKSUM(*)
	FROM #Procedure2 ps
) ps
ON ds.ID = ps.ID
AND ds.ChkSum <> ps.ChkSum

Lastly, let’s get what does match in both procedures. We can do this using either INTERSECT (2005+) or CHECKSUM.

-- Using intersect, we don't need to worry about the order of the table selects
SELECT *
FROM #Procedure1
INTERSECT
SELECT *
FROM #Procedure2 ds
SELECT *
FROM (
	SELECT ID
		,ChkSum = CHECKSUM(*)
	FROM #Procedure1 ds
) ds
JOIN(
	SELECT ID
		,ChkSum = CHECKSUM(*)
	FROM #Procedure2 ps
) ps
ON ds.ID = ps.ID
AND ds.ChkSum <> ps.ChkSum

Configure Change Data Capture Settings

After setting up change data capture (CDC), there are a few configuration options that allow for optimal performance based upon applications load.

Using the default values, the most number of transactions that can be grabbed every 5 seconds is 5000. Remember, a transaction is not the same as a database row. One transaction can update “n” number of rows.

In order to determine latency for you CDC solution, you will need to query a DMV view:

select latency, *
from sys.dm_cdc_log_scan_sessions

The latency will be shown in seconds, if it is getting too high, start increasing maxtrans, maxscans, and decrease the polling interval.

  • polling interval – (default 5 seconds) the amount of time to wait between log scans
  • maxtrans – (default 500) the number of transactions to grab with each scan
  • maxscans – (default 10) the number of scans performed between each polling interval

In order to change these settings, use the system stored procedure (sys.sp_cdc_change_job) after changing the database which contains your capture instance. The CDC job needs to be restarted after making these changes.

-- Change to the database that contains your capture instance
use YourDatabaseHere
go
exec sys.sp_cdc_change_job @job_type = 'capture'
						,@maxtrans = 501
						,@maxscans = 10
						,@continuous = 1
						,@pollinginterval = 5

Changing the latency between the cleanup is done using the same stored procedure:

exec sys.sp_cdc_change_job @job_type = 'cleanup'
						,@retention = 4320 -- Number of minutes to retain (72 hours)
						,@threshold = 5000

To view the results of the changes, execute the following:

-- View the results of the changed parameter is the cdc_jobs table
SELECT *
FROM msdb.dbo.cdc_jobs