An Introduction to the LEAD() Function in SQL

Many relational database management systems (RDBMSs) have a LEAD() function that allows us to retrieve a value from a following row.

The SQL LEAD() function returns the value of a given expression from the row that leads (follows) the current row by a given number of rows within its partition.

Example

Here’s an example to demonstrate:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 1 ) 
        OVER ( ORDER BY Score ) 
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Max     | Keep Quiet  |     1 |    2 |
| Bruno   | Keep Quiet  |     2 |    8 |
| Cooper  | Keep Quiet  |     8 |   43 |
| Bruno   | Fetch Stick |    43 |   51 |
| Cooper  | Wag Tail    |    51 |   65 |
| Bruno   | Wag Tail    |    65 |   67 |
| Cooper  | Fetch Stick |    67 |   87 |
| Max     | Wag Tail    |    87 |   91 |
| Max     | Fetch Stick |    91 | NULL |
+---------+-------------+-------+------+

Here, I’m using the SQL LEAD() function to return the score from the next row.

The first argument specifies the column and the second argument specifies the lead amount. My first argument to the function is Score, and the second argument is 1. Therefore in this case, the value comes from the next row’s Score column. It’s the next row because 1 means it goes forward one row.

Many/most RDBMSs provide a default lead value of 1, meaning that if we don’t provide the second argument, it will use 1. Therefore, it uses the next row by default.

We can see that the last row contains NULL in the Lead column. That’s because there are no more rows for which to retrieve a value. If there are no more rows, then there are no more values to retrieve, and so we get NULL. Some DBMSs allow us to specify a default value to use instead of NULL (example later).

Change the Lead Value

We can change the lead value to go forward more than one row:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 3 ) 
        OVER ( ORDER BY Score ) 
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Max     | Keep Quiet  |     1 |   43 |
| Bruno   | Keep Quiet  |     2 |   51 |
| Cooper  | Keep Quiet  |     8 |   65 |
| Bruno   | Fetch Stick |    43 |   67 |
| Cooper  | Wag Tail    |    51 |   87 |
| Bruno   | Wag Tail    |    65 |   91 |
| Cooper  | Fetch Stick |    67 | NULL |
| Max     | Wag Tail    |    87 | NULL |
| Max     | Fetch Stick |    91 | NULL |
+---------+-------------+-------+------+

This time I specified 3 as the lead value. This resulted in the Lead column returning the score from three rows forward. It also resulted in the last three rows being NULL, as expected.

Specifying a Default Value

Some RDBMSs allow us to change the default value to be something other than NULL. For example, in MySQL, we can provide this as a third argument:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 3, 'N/A' ) 
        OVER ( ORDER BY Score ) 
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Max     | Keep Quiet  |     1 | 43   |
| Bruno   | Keep Quiet  |     2 | 51   |
| Cooper  | Keep Quiet  |     8 | 65   |
| Bruno   | Fetch Stick |    43 | 67   |
| Cooper  | Wag Tail    |    51 | 87   |
| Bruno   | Wag Tail    |    65 | 91   |
| Cooper  | Fetch Stick |    67 | N/A  |
| Max     | Wag Tail    |    87 | N/A  |
| Max     | Fetch Stick |    91 | N/A  |
+---------+-------------+-------+------+

I passed the string N/A as the third argument, which means that it is now the default value. In other words, N/A will be returned whenever there’s no value.

Be aware that some RDBMSs (such as SQL Server) only allow us to use a default value that is type-compatible with the column specified in the first argument.

Running the above example in SQL Server produces the following error:

Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'N/A' to data type int.

The exact error may depend on the data types involved.

Anyway, in SQL Server, we would need to change our default value to be type-compatible with the column specified by the first argument. For example:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 3, 0 ) 
        OVER ( ORDER BY Score ) 
    AS "Lead"
FROM Dogs;

Result:

DogName  Activity     Score  Lead
-------  -----------  -----  ----
Max      Keep Quiet   1      43  
Bruno    Keep Quiet   2      51  
Cooper   Keep Quiet   8      65  
Bruno    Fetch Stick  43     67  
Cooper   Wag Tail     51     87  
Bruno    Wag Tail     65     91  
Cooper   Fetch Stick  67     0   
Max      Wag Tail     87     0   
Max      Fetch Stick  91     0   

Partitioning

We can partition the row set by a given column. This is common when using window functions in SQL. It allows us to apply the function to each partition, without other partitions affecting the result.

To do this, we use the PARTITION BY clause:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 1 ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score 
            )
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 |   67 |
| Cooper  | Fetch Stick |    67 |   91 |
| Max     | Fetch Stick |    91 | NULL |
| Max     | Keep Quiet  |     1 |    2 |
| Bruno   | Keep Quiet  |     2 |    8 |
| Cooper  | Keep Quiet  |     8 | NULL |
| Cooper  | Wag Tail    |    51 |   65 |
| Bruno   | Wag Tail    |    65 |   87 |
| Max     | Wag Tail    |    87 | NULL |
+---------+-------------+-------+------+

This time our LEAD() function is applied to each partition. Only the scores from the current partition are considered, and so we get NULL at the end of each partition.

Changing the Order

Here’s what happens when we change the order to descending order:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 1 ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score DESC
            )
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Max     | Fetch Stick |    91 |   67 |
| Cooper  | Fetch Stick |    67 |   43 |
| Bruno   | Fetch Stick |    43 | NULL |
| Cooper  | Keep Quiet  |     8 |    2 |
| Bruno   | Keep Quiet  |     2 |    1 |
| Max     | Keep Quiet  |     1 | NULL |
| Max     | Wag Tail    |    87 |   65 |
| Bruno   | Wag Tail    |    65 |   51 |
| Cooper  | Wag Tail    |    51 | NULL |
+---------+-------------+-------+------+

Omitting the ORDER BY Clause

Some RDBMSs require that the OVER clause includes an ORDER BY clause. Removing the ORDER BY clause in such RDBMSs results in an error.

Here’s what happens when I remove the ORDER BY clause in SQL Server:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( Score, 1 ) 
        OVER ( 
            PARTITION BY Activity
            )
    AS "Lead"
FROM Dogs;

Result:

Msg 4112, Level 15, State 1, Line 5
The function 'LEAD' must have an OVER clause with ORDER BY.

But other RDBMSs don’t have such a restriction. For example, running the above code in MySQL produces the following:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lead |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 |   67 |
| Cooper  | Fetch Stick |    67 |   91 |
| Max     | Fetch Stick |    91 | NULL |
| Bruno   | Keep Quiet  |     2 |    8 |
| Cooper  | Keep Quiet  |     8 |    1 |
| Max     | Keep Quiet  |     1 | NULL |
| Bruno   | Wag Tail    |    65 |   51 |
| Cooper  | Wag Tail    |    51 |   87 |
| Max     | Wag Tail    |    87 | NULL |
+---------+-------------+-------+------+

No error.

If you’re getting an error when omitting the ORDER BY clause, check the documentation for your RDBMS. It may be that it requires the ORDER BY clause.

Sorting by a Different Column

In the previous examples, the column that we pass to the LEAD() function is the same one we pass to the ORDER BY clause, but we don’t necessarily have to do this. We can sort by a different column to the one we’re returning.

Example:

SELECT 
    DogName,
    Activity,
    Score,
    LEAD( DogName, 1 ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score
            )
    AS "Lead"
FROM Dogs;

Result:

+---------+-------------+-------+--------+
| DogName | Activity    | Score | Lead   |
+---------+-------------+-------+--------+
| Bruno   | Fetch Stick |    43 | Cooper |
| Cooper  | Fetch Stick |    67 | Max    |
| Max     | Fetch Stick |    91 | NULL   |
| Max     | Keep Quiet  |     1 | Bruno  |
| Bruno   | Keep Quiet  |     2 | Cooper |
| Cooper  | Keep Quiet  |     8 | NULL   |
| Cooper  | Wag Tail    |    51 | Bruno  |
| Bruno   | Wag Tail    |    65 | Max    |
| Max     | Wag Tail    |    87 | NULL   |
+---------+-------------+-------+--------+

This time we returned the dog names instead of the scores, but we still sorted by the scores. So we still get a value from the following row, just from a different column.

DBMSs that include the LEAD() function also include a LAG() function that returns the value from the other direction (it returns the value from a previous row).