Find Best Clustered Index

The following query will compare the nonclustered indexes vs the clustered index and determine which index would qualify as the best clustered index based upon the DMV statistics.

Keep in mind the suggestions returned by this query do not take into consideration fragmentation that may occur on the suggested keys. Use these suggestions only after your own evaluations.

If a clustered index does not exist on the table, it will also suggest one of the existing nonclustered indexes as the clustered index.

[cc lang=”sql”]
DECLARE @NonClusteredSeekPct float
DECLARE @ClusteredLookupFromNCPct float

— Define percentage of usage the non clustered should
— receive over the clustered index
SET @NonClusteredSeekPct = 1.50 — 150%

— Define the percentage of all lookups on the clustered index
— should be executed by this non clustered index
SET @ClusteredLookupFromNCPct = .75 — 75%

SELECT
TableName = object_name(idx.object_id)
,NonUsefulClusteredIndex = idx.NAME
,ShouldBeClustered = nc.NonClusteredName
,Clustered_User_Seeks = c.user_seeks
,NonClustered_User_Seeks = nc.user_seeks
,Clustered_User_Lookups = c.user_lookups
,DatabaseName = db_name(c.database_id)
FROM sys.indexes idx
LEFT JOIN sys.dm_db_index_usage_stats c
ON idx.object_id = c.object_id
AND idx.index_id = c.index_id
–AND c.database_id = @DBID
JOIN (
SELECT
idx.object_id
,nonclusteredname = idx.NAME
,ius.user_seeks
FROM sys.indexes idx
JOIN sys.dm_db_index_usage_stats ius
ON idx.object_id = ius.object_id
AND idx.index_id = ius.index_id
WHERE idx.type_desc = ‘nonclustered’
AND ius.user_seeks =
(
SELECT MAX(user_seeks)
FROM sys.dm_db_index_usage_stats
WHERE object_id = ius.object_id
AND type_desc = ‘nonclustered’
)
GROUP BY
idx.object_id
,idx.NAME
,ius.user_seeks
) nc
ON nc.object_id = idx.object_id
WHERE
idx.type_desc IN (‘clustered’,’heap’)
— non clustered user seeks outweigh clustered by 150%
AND nc.user_seeks > (c.user_seeks * @NonClusteredSeekPct)
— nc index usage is primary cause of clustered lookups 80%
AND nc.user_seeks >= (c.user_lookups * @ClusteredLookupFromNCPct)
ORDER BY nc.user_seeks DESC

[/cc]

The way it performs this determination is by comparing the lookups for a particular nonclustered index to the current number of seeks for the current clustered index.

4 comments
Darek 20 May 2015 at 4:18 pm

Hi there. Derek, is it not right that—at least on an OLTP system—a clustered index should be: static, narrow, monotonic and unique? I know that Paul Randal/Brent Ozar recommends this kind of set up. He also states the reasons. The best column for a clustered index is either an integer IDENTITY or a monotonically increasing GUID. The indexes that have many columns in them should almost exclusively be non-clustered indexes. Problem with indexes with many columns is that it’s almost impossible to guarantee that no columns will change. Hence the ‘narrow’ condition is not met. But there are more issues with clustered indexes that are not described by the properties that I’ve outlined above. Best regards, Darek.

Derek Dieter 11 Jun 2015 at 4:25 am

Hi Darek,

I absolutely agree. I would say 95-98% of the time, the clustered key should be a single unique key (I prefer identity). Here is a newer version of the article where I explain that more in depth (http://sqlserverplanet.com/indexes/choosing-the-best-clustered-index)

There are some cases however where I think a different key can better serve as a clustered key. An example of this could be the classic order / orderline table combo. While OrderLineID would may the obvious choice as the clustered key, sometimes the foreign key OrderID could be a better choice as a clustered key within orderline. The reason is, orderlineid is never referenced outside the context of an OrderID, and unless you create a covering index with all columns included, you can run into so deadlock scenarios. Also, orderid within orderline is a transactional table and would not be updated so fragmentation shouldn’t *really* be an issue.

With that said, I agree it is still a much safer bet to go with a single unique narrow key as the clustered index. With a single narrow key, a datawarehouse can create an index on modifieddate, include the clustered primary key, and then batch all records to fetch based off that.

Shiv Das 24 Apr 2012 at 11:24 pm

Hi Derek,

I executed the code but it is not displaying any record.

Derek Dieter 25 Apr 2012 at 12:27 am

That’s probably a good thing Shiv. Either most of your queries are using already using the clustered indexes, or your non clustered indexes contain the necessary columns to avoid performing lookups. Oh, and make sure you have the right database selected.

Featured Articles

 Site Author

  • Thanks for visiting!