The SQL ROW_NUMBER()
function is a window function that assigns and returns a row number of each row in a query partition or result set. Numbering starts at 1 and increments sequentially.
This enables us to add a “row number” column to our queries.
Example
Here’s an example of how it works:
SELECT
ROW_NUMBER() OVER ( ORDER BY Score ) AS "Row Number",
DogName,
Activity,
Score
FROM Dogs;
Result:
+------------+---------+-------------+-------+ | Row Number | DogName | Activity | Score | +------------+---------+-------------+-------+ | 1 | Max | Keep Quiet | 1 | | 2 | Bruno | Keep Quiet | 2 | | 3 | Cooper | Keep Quiet | 8 | | 4 | Bruno | Fetch Stick | 43 | | 5 | Cooper | Wag Tail | 51 | | 6 | Bruno | Wag Tail | 65 | | 7 | Cooper | Fetch Stick | 67 | | 8 | Max | Wag Tail | 87 | | 9 | Max | Fetch Stick | 91 | +------------+---------+-------------+-------+
We use an OVER
clause to determine how the row set is partitioned and ordered before the window function is applied.
In this case I didn’t apply any partitioning, but I did specify that the results should be ordered by the Score
column. Either way, the Row Number
column increments from 1
to 9
.
Omitting the ORDER BY
Clause
Some RDBMSs allow us to omit the ORDER BY
clause, but others will return an error.
MySQL for example, allows us to omit the ORDER BY
clause, but when we do the ordering is nondeterministic:
SELECT
ROW_NUMBER() OVER () AS "Row Number",
DogName,
Activity,
Score
FROM Dogs;
Result:
+------------+---------+-------------+-------+ | Row Number | DogName | Activity | Score | +------------+---------+-------------+-------+ | 1 | Bruno | Fetch Stick | 43 | | 2 | Cooper | Fetch Stick | 67 | | 3 | Max | Fetch Stick | 91 | | 4 | Bruno | Wag Tail | 65 | | 5 | Cooper | Wag Tail | 51 | | 6 | Max | Wag Tail | 87 | | 7 | Bruno | Keep Quiet | 2 | | 8 | Cooper | Keep Quiet | 8 | | 9 | Max | Keep Quiet | 1 | +------------+---------+-------------+-------+
So in this case, we can see that the Row Number
column increments the same as it did in the previous example (i.e. from 1
to 9
), but the actual ordering of the rows is different. It’s no longer sorted by the Score
column, and as mentioned, the ordering is nondeterministic.
On the other hand, omitting the ORDER BY
clause in SQL Server causes an error. Running the above code in SQL Server returns the following error:
Msg 4112, Level 15, State 1, Line 2 The function 'ROW_NUMBER' must have an OVER clause with ORDER BY.
Partitioning
When used with partitions, the ROW_NUMBER()
function returns the row’s number within its partition. In other words, the numbering resets with each new partition.
In SQL we create partitions with the PARTITION BY
clause:
SELECT
ROW_NUMBER( )
OVER (
PARTITION BY Activity
ORDER BY Score
)
AS "Row Number",
DogName,
Activity,
Score
FROM Dogs;
Result:
+------------+---------+-------------+-------+ | Row Number | DogName | Activity | Score | +------------+---------+-------------+-------+ | 1 | Bruno | Fetch Stick | 43 | | 2 | Cooper | Fetch Stick | 67 | | 3 | Max | Fetch Stick | 91 | | 1 | Max | Keep Quiet | 1 | | 2 | Bruno | Keep Quiet | 2 | | 3 | Cooper | Keep Quiet | 8 | | 1 | Cooper | Wag Tail | 51 | | 2 | Bruno | Wag Tail | 65 | | 3 | Max | Wag Tail | 87 | +------------+---------+-------------+-------+
Here, I partitioned the results by the Activity
column. There are three activities, and so we have three sets of numbering.
In this case, each partition contains three rows, and so the numbering goes from 1 to 3 in each partition. If one of the partitions contained say, four rows, then that partition would be numbered from 1 to 4.