A Quick Introduction to the NTILE() Function in SQL

Many SQL databases have a window function called NTILE() function that divides a rowset or partition into a given number of groups (buckets). The function typically returns the bucket number of the current row within its partition.

Example

Here’s a basic example to demonstrate how the NTILE() function works in SQL:

SELECT 
    DogName,
    Activity,
    Score,
    NTILE( 4 ) 
        OVER ( ORDER BY Score ) 
    AS "NTILE"
FROM Dogs;

Result:

+---------+-------------+-------+-------+
| DogName | Activity    | Score | NTILE |
+---------+-------------+-------+-------+
| Max     | Keep Quiet  |     1 |     1 |
| Bruno   | Keep Quiet  |     2 |     1 |
| Cooper  | Keep Quiet  |     8 |     1 |
| Bruno   | Fetch Stick |    43 |     2 |
| Cooper  | Wag Tail    |    51 |     2 |
| Bruno   | Wag Tail    |    65 |     3 |
| Cooper  | Fetch Stick |    67 |     3 |
| Max     | Wag Tail    |    87 |     4 |
| Max     | Fetch Stick |    91 |     4 |
+---------+-------------+-------+-------+

I specified 4 as the number of buckets, and I used an OVER clause with an ORDER BY clause to specify the ordering.

I didn’t apply any partitioning in this example, and so the number of buckets is applied to the whole rowset.

The OVER clause is pretty much a requirement with SQL window functions. The ORDER BY clause may be required or it may be optional, depending on your DBMS.

Partitioning

As mentioned, the above example doesn’t use partitions – the NTILE() function is applied to the whole rowset.

In SQL we use the PARTITION BY clause to create partitions.

This example uses partitions:

SELECT 
    DogName,
    Activity,
    Score,
    NTILE( 2 ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score 
            ) 
    AS "NTILE"
FROM Dogs;

Result:

+---------+-------------+-------+-------+
| DogName | Activity    | Score | NTILE |
+---------+-------------+-------+-------+
| Bruno   | Fetch Stick |    43 |     1 |
| Cooper  | Fetch Stick |    67 |     1 |
| Max     | Fetch Stick |    91 |     2 |
| Max     | Keep Quiet  |     1 |     1 |
| Bruno   | Keep Quiet  |     2 |     1 |
| Cooper  | Keep Quiet  |     8 |     2 |
| Cooper  | Wag Tail    |    51 |     1 |
| Bruno   | Wag Tail    |    65 |     1 |
| Max     | Wag Tail    |    87 |     2 |
+---------+-------------+-------+-------+

In this case I partitioned the results by activity. There are three activities, and so the results are partitioned into three partitions – one partition per activity.

Changing the Order

Here’s what happens when we change the order to descending order:

SELECT 
    DogName,
    Activity,
    Score,
    NTILE( 2 ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score DESC
            ) 
    AS "NTILE"
FROM Dogs;

Result:

+---------+-------------+-------+-------+
| DogName | Activity    | Score | NTILE |
+---------+-------------+-------+-------+
| Max     | Fetch Stick |    91 |     1 |
| Cooper  | Fetch Stick |    67 |     1 |
| Bruno   | Fetch Stick |    43 |     2 |
| Cooper  | Keep Quiet  |     8 |     1 |
| Bruno   | Keep Quiet  |     2 |     1 |
| Max     | Keep Quiet  |     1 |     2 |
| Max     | Wag Tail    |    87 |     1 |
| Bruno   | Wag Tail    |    65 |     1 |
| Cooper  | Wag Tail    |    51 |     2 |
+---------+-------------+-------+-------+

We still get two buckets, but some rows are assigned different bucket numbers due to the different ordering.

Omitting the ORDER BY Clause

Some DBMSs allow us to omit the ORDER BY clause and some don’t.

For example, MySQL allows us to omit it:

SELECT 
    DogName,
    Activity,
    Score,
    NTILE( 4 ) 
        OVER ( ) 
    AS "NTILE"
FROM Dogs;

Result:

+---------+-------------+-------+-------+
| DogName | Activity    | Score | NTILE |
+---------+-------------+-------+-------+
| Bruno   | Fetch Stick |    43 |     1 |
| Cooper  | Fetch Stick |    67 |     1 |
| Max     | Fetch Stick |    91 |     1 |
| Bruno   | Wag Tail    |    65 |     2 |
| Cooper  | Wag Tail    |    51 |     2 |
| Max     | Wag Tail    |    87 |     3 |
| Bruno   | Keep Quiet  |     2 |     3 |
| Cooper  | Keep Quiet  |     8 |     4 |
| Max     | Keep Quiet  |     1 |     4 |
+---------+-------------+-------+-------+

MySQL returns its results without error.

But SQL Server won’t let us omit it. Here’s the error returned when I run the above code in SQL Server:

Msg 4112, Level 15, State 1, Line 5
The function 'NTILE' must have an OVER clause with ORDER BY.

SQL Server returns the above error in most window functions when we omit the ORDER BY clause. The easiest way to overcome this is to include an ORDER BY clause.