Selecting Top N Records per Group

When you have multiple rows for let’s say a productID, there will probably be times when you need to select the top n subset of each those productid’s based upon a sort criteria. This is a relatively common TSQL need and there are a couple different methods to do this.

First let’s create some dummy data.

[cc lang=”sql”]
CREATE TABLE [#productzabub](
[SalesOrderID] [int] NOT NULL,
[ProductID] [int] NOT NULL
)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 707)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 707)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 707)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 708)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 708)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 708)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 711)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 711)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 711)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 712)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 712)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 712)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 714)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 714)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 714)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71782, 715)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 715)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 715)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71783, 716)
INSERT [#productzabub] ([SalesOrderID], [ProductID]) VALUES (71784, 716)
[/cc]

Now that we have our data created, let’s select the top 2 rows for each productid based upon the earliest salesorderid. This salesorderid could actually represent any sequential column in the table, be it a date, or a primary key.

The first method we’ll use is to select using the new window function ROW_NUMBER() introduced in SQL 2005. What we’re doing here is simply creating a numbering system that enumerates each instance of the productid ordered by the salesorderid. We then do another select on this query.
[cc lang=”sql”]
SELECT TOP 20
t.SalesOrderID,
t.ProductID
FROM (
SELECT
sod.SalesOrderID,
sod.ProductID,
rowid = ROW_NUMBER() OVER (PARTITION BY ProductID ORDER BY SalesOrderID)
FROM #productzabub sod
) t
WHERE rowid <= 2 [/cc] The next method we'll use is compatible with all versions of SQL. Here we are doing a self-join using a correlated subquery. Meaning we are correlating the productid from the outer query, with the productid from the inner query. We then select the top two salesorderid's from each instance, constraining our outer result set to the top 2 rows for product. [cc lang="sql"] SELECT sod.SalesOrderID, sod.ProductID FROM #productzabub sod WHERE SalesOrderID IN ( SELECT TOP 2 SalesOrderID FROM #productzabub WHERE ProductID = sod.ProductID ORDER BY SalesOrderID ) [/cc] That's pretty much it. Personally I prefer using the window functions when I can, they typically have a speed advantage over other methods because they don't join anything, just sort by doing an order by internally. You may see other methods using a CTE. This is not actually another method, it’s simply the same as the windowed function execution-wise.

7 comments
Cecil 23 Dec 2015 at 4:10 pm

I just took a looping query’s execution time from 40 seconds to 1 second with this info. THANK YOU! Awesome stuff. I learned something new today.

noe 12 Nov 2015 at 2:40 pm

very good thanks…!!!

Asoka 03 Nov 2015 at 5:46 pm

Of great help… thank you!

Rienk 24 Sep 2015 at 7:23 am

Thanks a lot!

gavin 03 Nov 2014 at 4:18 pm

really helped with something, thanks a lot

Gonzalo 25 Jan 2014 at 7:25 am

Just to say Thank You!!! You are able to explain quite simple something complicated to explain for others.

NATARAJ 20 Aug 2013 at 10:02 am

It very good.

Featured Articles

 Site Author

  • Thanks for visiting!
css.php