Understanding the DENSE_RANK() Function in MySQL

In MySQL, the DENSE_RANK() function is a window function that returns the rank of the current row within its partition, without gaps.

By “without gaps” this means that it returns contiguous rank numbers whenever there are peers. Peers are considered ties and receive the same rank. In such cases, the next rank value is one greater than the current one (i.e. the one that the peers receive).

This is in contrast to the RANK() function, which returns noncontiguous rank numbers (i.e. it assigns peers the same rank value, but there’s a gap between that value and the next rank value).

Syntax

The syntax goes like this:

DENSE_RANK() over_clause

So it requires an OVER clause, which determines how the rowset is partitioned and ordered before the window function is applied.

Example

Suppose we have a table with the following data:

SELECT * FROM Idiots;

Result:

+---------+-------------------+------+
| IdiotId | IdiotName         | IQ   |
+---------+-------------------+------+
|       1 | Dumb              |   60 |
|       2 | Dumber            |   50 |
|       3 | Dumbest           |   30 |
|       4 | Homer             |   40 |
|       5 | Peter Griffin     |   50 |
|       6 | Patrick Star      |   40 |
|       7 | Ed                |   40 |
|       8 | Ralph Wiggum      |   65 |
|       9 | Cosmo             |   55 |
|      10 | Richard Watterson |   70 |
+---------+-------------------+------+

The following query uses DENSE_RANK() against the IQ column:

SELECT 
    IdiotName,
    IQ,
    DENSE_RANK() OVER ( ORDER BY IQ ) AS "Dense Rank"
FROM Idiots;

Result:

+-------------------+------+------------+
| IdiotName         | IQ   | Dense Rank |
+-------------------+------+------------+
| Dumbest           |   30 |          1 |
| Homer             |   40 |          2 |
| Patrick Star      |   40 |          2 |
| Ed                |   40 |          2 |
| Dumber            |   50 |          3 |
| Peter Griffin     |   50 |          3 |
| Cosmo             |   55 |          4 |
| Dumb              |   60 |          5 |
| Ralph Wiggum      |   65 |          6 |
| Richard Watterson |   70 |          7 |
+-------------------+------+------------+
10 rows in set (0.00 sec)

Here, the Dense Rank column contains the output of the DENSE_RANK() function.

In this case we have two instances of a tie. Three idiots are tied with an IQ of 40 and two other idiots have an IQ of 50. We can see that those idiots with the same IQ share the same rank.

However, even when they share the same rank, the next rank value is contiguous. In other words, there are no gaps between the current (shared) rank value and the next one.

By contrast, if we’d used the RANK() function, we would see gaps. To demonstrate, let’s add a RANK() column to our query:

SELECT 
    IdiotName,
    IQ,
    DENSE_RANK() OVER win AS "Dense Rank",
    RANK() OVER win AS "Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ );

Result:

+-------------------+------+------------+------+
| IdiotName         | IQ   | Dense Rank | Rank |
+-------------------+------+------------+------+
| Dumbest           |   30 |          1 |    1 |
| Homer             |   40 |          2 |    2 |
| Patrick Star      |   40 |          2 |    2 |
| Ed                |   40 |          2 |    2 |
| Dumber            |   50 |          3 |    5 |
| Peter Griffin     |   50 |          3 |    5 |
| Cosmo             |   55 |          4 |    7 |
| Dumb              |   60 |          5 |    8 |
| Ralph Wiggum      |   65 |          6 |    9 |
| Richard Watterson |   70 |          7 |   10 |
+-------------------+------+------------+------+
10 rows in set (0.01 sec)

In addition to adding the RANK() function, I also moved the ORDER BY clause to a named window. This saved me from having to replicate the same clause multiple times in the SELECT list. I was able to simply provide the name of the named window (in this case, win).

Anyway, we can see that DENSE_RANK() returns 3 when RANK() returns 5, which affects all subsequent rank values. And after the next tie, DENSE_RANK() returns 4 when RANK() returns 7, which further affects all subsequent values.

Reorder the Rankings

We can also order the results differently. The above examples order the results by the IQ column in ascending order. In those examples, the idiot with the lowest IQ gets the highest rank.

Let’s switch this around so that the idiot with the lowest IQ gets the lowest rank:

SELECT 
    IdiotName,
    IQ,
    DENSE_RANK() OVER win AS "Dense Rank",
    RANK() OVER win AS "Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ DESC );

Result:

+-------------------+------+------------+------+
| IdiotName         | IQ   | Dense Rank | Rank |
+-------------------+------+------------+------+
| Richard Watterson |   70 |          1 |    1 |
| Ralph Wiggum      |   65 |          2 |    2 |
| Dumb              |   60 |          3 |    3 |
| Cosmo             |   55 |          4 |    4 |
| Dumber            |   50 |          5 |    5 |
| Peter Griffin     |   50 |          5 |    5 |
| Homer             |   40 |          6 |    7 |
| Patrick Star      |   40 |          6 |    7 |
| Ed                |   40 |          6 |    7 |
| Dumbest           |   30 |          7 |   10 |
+-------------------+------+------------+------+
10 rows in set (0.00 sec)

Here, it’s the ordering of the whole result set that has changed. The Dense Rank and Rank columns didn’t change, in that, they still go from low at the first row, to high at the last row.

Note that an ORDER BY clause at the end of the query can affect the results.

Let’s add an ORDER BY clause to the end of the query:

SELECT 
    IdiotName,
    IQ,
    DENSE_RANK() OVER win AS "Dense Rank",
    RANK() OVER win AS "Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ )
ORDER BY IQ DESC;

Result:

+-------------------+------+------------+------+
| IdiotName         | IQ   | Dense Rank | Rank |
+-------------------+------+------------+------+
| Richard Watterson |   70 |          7 |   10 |
| Ralph Wiggum      |   65 |          6 |    9 |
| Dumb              |   60 |          5 |    8 |
| Cosmo             |   55 |          4 |    7 |
| Dumber            |   50 |          3 |    5 |
| Peter Griffin     |   50 |          3 |    5 |
| Homer             |   40 |          2 |    2 |
| Patrick Star      |   40 |          2 |    2 |
| Ed                |   40 |          2 |    2 |
| Dumbest           |   30 |          1 |    1 |
+-------------------+------+------------+------+
10 rows in set (0.00 sec)

Here, I removed the DESC from the window function and added ORDER BY IQ DESC to the end of the query. This results in the whole query being sorted in descending order while the window clause is sorted in ascending order.

Omitting the ORDER BY Clause

We should use the DENSE_RANK() function with an ORDER BY clause if we want meaningful results. Omitting the ORDER BY clause will cause all rows to be peers (and therefore they’ll all share the same rank).

Here’s what happens when we omit the ORDER BY clause:

SELECT 
    IdiotName,
    IQ,
    DENSE_RANK() OVER () AS "Dense Rank"
FROM Idiots;

Result:

+-------------------+------+------------+
| IdiotName         | IQ   | Dense Rank |
+-------------------+------+------------+
| Dumb              |   60 |          1 |
| Dumber            |   50 |          1 |
| Dumbest           |   30 |          1 |
| Homer             |   40 |          1 |
| Peter Griffin     |   50 |          1 |
| Patrick Star      |   40 |          1 |
| Ed                |   40 |          1 |
| Ralph Wiggum      |   65 |          1 |
| Cosmo             |   55 |          1 |
| Richard Watterson |   70 |          1 |
+-------------------+------+------------+
10 rows in set (0.00 sec)

All rows contain a dense rank of 1.

Partitions

We can use the PARTITION BY clause to partition the rowset by a given column. When we do this, DENSE_RANK() calculates the dense rank of the row within each partition. That is, the value is evaluated against the other rows in the same partition.

Suppose we have the following table:

SELECT * FROM Dogs;

Result:

+---------+-------------+-------+
| DogName | Activity    | Score |
+---------+-------------+-------+
| Bruno   | Fetch Stick |    43 |
| Cooper  | Fetch Stick |    67 |
| Max     | Fetch Stick |    91 |
| Bruno   | Wag Tail    |    51 |
| Cooper  | Wag Tail    |    51 |
| Max     | Wag Tail    |    87 |
| Bruno   | Keep Quiet  |     1 |
| Cooper  | Keep Quiet  |     8 |
| Max     | Keep Quiet  |    12 |
+---------+-------------+-------+
9 rows in set (0.00 sec)

We can use a PARTITION BY clause to get the dense rank of each dog within each activity:

SELECT 
    DogName,
    Activity,
    Score,
    DENSE_RANK() OVER ( PARTITION BY Activity ORDER BY Score ) AS "Rank"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Rank |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 |    1 |
| Cooper  | Fetch Stick |    67 |    2 |
| Max     | Fetch Stick |    91 |    3 |
| Bruno   | Keep Quiet  |     1 |    1 |
| Cooper  | Keep Quiet  |     8 |    2 |
| Max     | Keep Quiet  |    12 |    3 |
| Bruno   | Wag Tail    |    51 |    1 |
| Cooper  | Wag Tail    |    51 |    1 |
| Max     | Wag Tail    |    87 |    2 |
+---------+-------------+-------+------+
9 rows in set (0.00 sec)

Here’s an example that compares DENSE_RANK() to other similar functions:

SELECT 
    DogName,
    Activity,
    Score,
    DENSE_RANK() OVER win AS "Dense Rank",
    RANK() OVER win AS "Rank",
    PERCENT_RANK() OVER win AS "Percentage Rank",
    CUME_DIST() OVER win AS "Cumulative Distribution"
FROM Dogs
WINDOW win AS (PARTITION BY Activity ORDER BY Score);

Result:

+---------+-------------+-------+------------+------+-----------------+-------------------------+
| DogName | Activity    | Score | Dense Rank | Rank | Percentage Rank | Cumulative Distribution |
+---------+-------------+-------+------------+------+-----------------+-------------------------+
| Bruno   | Fetch Stick |    43 |          1 |    1 |               0 |      0.3333333333333333 |
| Cooper  | Fetch Stick |    67 |          2 |    2 |             0.5 |      0.6666666666666666 |
| Max     | Fetch Stick |    91 |          3 |    3 |               1 |                       1 |
| Bruno   | Keep Quiet  |     1 |          1 |    1 |               0 |      0.3333333333333333 |
| Cooper  | Keep Quiet  |     8 |          2 |    2 |             0.5 |      0.6666666666666666 |
| Max     | Keep Quiet  |    12 |          3 |    3 |               1 |                       1 |
| Bruno   | Wag Tail    |    51 |          1 |    1 |               0 |      0.6666666666666666 |
| Cooper  | Wag Tail    |    51 |          1 |    1 |               0 |      0.6666666666666666 |
| Max     | Wag Tail    |    87 |          2 |    3 |               1 |                       1 |
+---------+-------------+-------+------------+------+-----------------+-------------------------+
9 rows in set (0.00 sec)