Understanding the RANK() Function in MySQL

In MySQL, RANK() is a window function that returns the rank of the current row within its partition, with gaps.

By “gaps” this means that it returns noncontiguous rank numbers whenever there are peers. Peers are considered ties and receive the same rank, but in such cases, we get a gap between this rank value and the next rank value.

This is in contrast to the DENSE_RANK() function that returns contiguous rank numbers (i.e. it assigns peers the same rank value, but the next rank value is one greater, without any gaps).

Syntax

The syntax goes like this:

RANK() over_clause

So it requires an OVER clause, which determines how the rowset is partitioned and ordered before the window function is applied.

Example

Suppose we have a table with the following data:

SELECT * FROM Idiots;

Result:

+---------+-------------------+------+
| IdiotId | IdiotName         | IQ   |
+---------+-------------------+------+
|       1 | Dumb              |   60 |
|       2 | Dumber            |   50 |
|       3 | Dumbest           |   30 |
|       4 | Homer             |   40 |
|       5 | Peter Griffin     |   50 |
|       6 | Patrick Star      |   40 |
|       7 | Ed                |   40 |
|       8 | Ralph Wiggum      |   65 |
|       9 | Cosmo             |   55 |
|      10 | Richard Watterson |   70 |
+---------+-------------------+------+

The following query uses RANK() against the IQ column:

SELECT 
    IdiotName,
    IQ,
    RANK() OVER ( ORDER BY IQ ) AS "Rank"
FROM Idiots;

Result:

+-------------------+------+------+
| IdiotName         | IQ   | Rank |
+-------------------+------+------+
| Dumbest           |   30 |    1 |
| Homer             |   40 |    2 |
| Patrick Star      |   40 |    2 |
| Ed                |   40 |    2 |
| Dumber            |   50 |    5 |
| Peter Griffin     |   50 |    5 |
| Cosmo             |   55 |    7 |
| Dumb              |   60 |    8 |
| Ralph Wiggum      |   65 |    9 |
| Richard Watterson |   70 |   10 |
+-------------------+------+------+
10 rows in set (0.02 sec)

Here, the Rank column contains the output of the RANK() function.

In this case we have two instances of a tie. Three idiots have an IQ of 40 and two idiots have an IQ of 50. We can see that those with the same IQ share the same rank. We can also see that there’s a gap between this rank value and the next one. For example, the rank values jump from 2 to 5 (it skips 3 and 4). This is because three idiots shared rank 2. Then the rank jumps from 5 to 7 (it skips 6). That’s because two idiots shared the rank of 5.

These gaps are an intentional part of the RANK() function’s design. By contrast, the DENSE_RANK() function doesn’t incorporate gaps in the ranking. Its ranking flows on from the previous rank without any gaps. Therefore, if you don’t want gaps, use DENSE_RANK() instead of RANK().

To demonstrate this more clearly, let’s add a DENSE_RANK() column to our query:

SELECT 
    IdiotName,
    IQ,
    RANK() OVER win AS "Rank",
    DENSE_RANK() OVER win AS "Dense Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ );

Result:

+-------------------+------+------+------------+
| IdiotName         | IQ   | Rank | Dense Rank |
+-------------------+------+------+------------+
| Dumbest           |   30 |    1 |          1 |
| Homer             |   40 |    2 |          2 |
| Patrick Star      |   40 |    2 |          2 |
| Ed                |   40 |    2 |          2 |
| Dumber            |   50 |    5 |          3 |
| Peter Griffin     |   50 |    5 |          3 |
| Cosmo             |   55 |    7 |          4 |
| Dumb              |   60 |    8 |          5 |
| Ralph Wiggum      |   65 |    9 |          6 |
| Richard Watterson |   70 |   10 |          7 |
+-------------------+------+------+------------+
10 rows in set (0.01 sec)

In this example I moved the ORDER BY clause to a named window. This saved me from having to replicate the same clause multiple times in the SELECT list. I was able to simply provide the name of the named window (in this case, win).

Anyway, we can see that DENSE_RANK() returns 3 when RANK() returns 5, which affects all subsequent rank values. And after the next tie, DENSE_RANK() returns 4 when RANK() returns 7, which further affects all subsequent values.

Reorder the Rankings

We can also order the results differently. The above examples order the results by the IQ column in ascending order. In those examples, the idiot with the lowest IQ gets the highest rank.

Let’s switch this around so that the idiot with the lowest IQ gets the lowest rank:

SELECT 
    IdiotName,
    IQ,
    RANK() OVER win AS "Rank",
    DENSE_RANK() OVER win AS "Dense Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ DESC );

Result:

+-------------------+------+------+------------+
| IdiotName         | IQ   | Rank | Dense Rank |
+-------------------+------+------+------------+
| Richard Watterson |   70 |    1 |          1 |
| Ralph Wiggum      |   65 |    2 |          2 |
| Dumb              |   60 |    3 |          3 |
| Cosmo             |   55 |    4 |          4 |
| Dumber            |   50 |    5 |          5 |
| Peter Griffin     |   50 |    5 |          5 |
| Homer             |   40 |    7 |          6 |
| Patrick Star      |   40 |    7 |          6 |
| Ed                |   40 |    7 |          6 |
| Dumbest           |   30 |   10 |          7 |
+-------------------+------+------+------------+
10 rows in set (0.01 sec)

Note that an ORDER BY clause at the end of the query can affect the results too. Let’s add an ORDER BY clause to the end of the query to see how that affects the result:

SELECT 
    IdiotName,
    IQ,
    RANK() OVER win AS "Rank",
    DENSE_RANK() OVER win AS "Dense Rank"
FROM Idiots
WINDOW win AS ( ORDER BY IQ DESC )
ORDER BY IQ ASC;

Result:

+-------------------+------+------+------------+
| IdiotName         | IQ   | Rank | Dense Rank |
+-------------------+------+------+------------+
| Dumbest           |   30 |   10 |          7 |
| Homer             |   40 |    7 |          6 |
| Patrick Star      |   40 |    7 |          6 |
| Ed                |   40 |    7 |          6 |
| Dumber            |   50 |    5 |          5 |
| Peter Griffin     |   50 |    5 |          5 |
| Cosmo             |   55 |    4 |          4 |
| Dumb              |   60 |    3 |          3 |
| Ralph Wiggum      |   65 |    2 |          2 |
| Richard Watterson |   70 |    1 |          1 |
+-------------------+------+------+------------+
10 rows in set (0.00 sec)

In this case the ORDER BY clause at the end of the query orders the result set in ascending order, while the window function is in descending order.

Omitting the ORDER BY Clause

We should use the RANK() function with an ORDER BY clause if we want meaningful results. Omitting the ORDER BY clause will cause all rows to be peers (and therefore they’ll all share the same rank).

Here’s what happens when we omit the ORDER BY clause:

SELECT 
    IdiotName,
    IQ,
    RANK() OVER () AS "Rank"
FROM Idiots;

Result:

+-------------------+------+------+
| IdiotName         | IQ   | Rank |
+-------------------+------+------+
| Dumb              |   60 |    1 |
| Dumber            |   50 |    1 |
| Dumbest           |   30 |    1 |
| Homer             |   40 |    1 |
| Peter Griffin     |   50 |    1 |
| Patrick Star      |   40 |    1 |
| Ed                |   40 |    1 |
| Ralph Wiggum      |   65 |    1 |
| Cosmo             |   55 |    1 |
| Richard Watterson |   70 |    1 |
+-------------------+------+------+
10 rows in set (0.00 sec)

All rows contain a rank of 1.

Partitions

We can use the PARTITION BY clause to partition the rowset by a given column. When we do this, RANK() calculates the rank of the row within each partition. That is, the value is evaluated against the other rows in the same partition.

Suppose we have the following table:

SELECT * FROM Dogs;

Result:

+---------+-------------+-------+
| DogName | Activity    | Score |
+---------+-------------+-------+
| Bruno   | Fetch Stick |    43 |
| Cooper  | Fetch Stick |    67 |
| Max     | Fetch Stick |    91 |
| Bruno   | Wag Tail    |    51 |
| Cooper  | Wag Tail    |    51 |
| Max     | Wag Tail    |    87 |
| Bruno   | Keep Quiet  |     1 |
| Cooper  | Keep Quiet  |     8 |
| Max     | Keep Quiet  |    12 |
+---------+-------------+-------+
9 rows in set (0.00 sec)

We can use a PARTITION BY clause to find out the rank of each dog within each activity:

SELECT 
    DogName,
    Activity,
    Score,
    RANK() OVER ( PARTITION BY Activity ORDER BY Score ) AS "Rank"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Rank |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 |    1 |
| Cooper  | Fetch Stick |    67 |    2 |
| Max     | Fetch Stick |    91 |    3 |
| Bruno   | Keep Quiet  |     1 |    1 |
| Cooper  | Keep Quiet  |     8 |    2 |
| Max     | Keep Quiet  |    12 |    3 |
| Bruno   | Wag Tail    |    51 |    1 |
| Cooper  | Wag Tail    |    51 |    1 |
| Max     | Wag Tail    |    87 |    3 |
+---------+-------------+-------+------+
9 rows in set (0.00 sec)

Here’s an example that demonstrates the range of results we can get with other similar functions:

SELECT 
    DogName,
    Activity,
    Score,
    RANK() OVER win AS "Rank",
    PERCENT_RANK() OVER win AS "Percentage Rank",
    DENSE_RANK() OVER win AS "Dense Rank",
    CUME_DIST() OVER win AS "Cumulative Distribution"
FROM Dogs
WINDOW win AS (PARTITION BY Activity ORDER BY Score);

Result:

+---------+-------------+-------+------+-----------------+------------+-------------------------+
| DogName | Activity    | Score | Rank | Percentage Rank | Dense Rank | Cumulative Distribution |
+---------+-------------+-------+------+-----------------+------------+-------------------------+
| Bruno   | Fetch Stick |    43 |    1 |               0 |          1 |      0.3333333333333333 |
| Cooper  | Fetch Stick |    67 |    2 |             0.5 |          2 |      0.6666666666666666 |
| Max     | Fetch Stick |    91 |    3 |               1 |          3 |                       1 |
| Bruno   | Keep Quiet  |     1 |    1 |               0 |          1 |      0.3333333333333333 |
| Cooper  | Keep Quiet  |     8 |    2 |             0.5 |          2 |      0.6666666666666666 |
| Max     | Keep Quiet  |    12 |    3 |               1 |          3 |                       1 |
| Bruno   | Wag Tail    |    51 |    1 |               0 |          1 |      0.6666666666666666 |
| Cooper  | Wag Tail    |    51 |    1 |               0 |          1 |      0.6666666666666666 |
| Max     | Wag Tail    |    87 |    3 |               1 |          2 |                       1 |
+---------+-------------+-------+------+-----------------+------------+-------------------------+
9 rows in set (0.01 sec)