Overview of the LAG() Function in MySQL

In MySQL, the LAG() function is a window function that returns the value of a given expression from the row that lags (precedes) the current row by a given number of rows within its partition.

Basically, it returns the value from a previous row.

Syntax

The syntax goes like this:

LAG(expr [, N[, default]]) [null_treatment] over_clause

So it requires an OVER clause, which determines how the rowset is partitioned and ordered before the window function is applied.

The optional N argument specifies the lag value. For example, 3 means three rows back.

The optional default argument specifies what to return if there’s no value (this will happen when the current row number is less than the lag).

The optional null_treatment clause specifies how NULL values are treated. This clause is part of the SQL standard, but MySQL only allows RESPECT NULLS, which is also the default. RESPECT NULLS means that NULL values are considered when calculating results. On the other hand, IGNORE NULLS is parsed, but produces an error (at least as of MySQL 8.0.33).

Example

Here’s a basic example to demonstrate how it works:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score ) 
        OVER ( ORDER BY Score ) 
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Bruno   | Keep Quiet  |     1 | NULL |
| Cooper  | Keep Quiet  |     8 |    1 |
| Max     | Keep Quiet  |    12 |    8 |
| Bruno   | Fetch Stick |    43 |   12 |
| Bruno   | Wag Tail    |    51 |   43 |
| Cooper  | Wag Tail    |    51 |   51 |
| Cooper  | Fetch Stick |    67 |   51 |
| Max     | Wag Tail    |    87 |   67 |
| Max     | Fetch Stick |    91 |   87 |
+---------+-------------+-------+------+

We can see that the Lag column lags the Score column by one row, and the first row is NULL. That’s because I didn’t provide any arguments other than the expression/column to use. It therefore used the default lag value of 1 and a default value of NULL.

Change the Lag

Let’s change the lag value:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 3 ) 
        OVER ( ORDER BY Score ) 
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Bruno   | Keep Quiet  |     1 | NULL |
| Cooper  | Keep Quiet  |     8 | NULL |
| Max     | Keep Quiet  |    12 | NULL |
| Bruno   | Fetch Stick |    43 |    1 |
| Bruno   | Wag Tail    |    51 |    8 |
| Cooper  | Wag Tail    |    51 |   12 |
| Cooper  | Fetch Stick |    67 |   43 |
| Max     | Wag Tail    |    87 |   51 |
| Max     | Fetch Stick |    91 |   51 |
+---------+-------------+-------+------+

This time I specified 3 as the lag value. This resulted in the first three rows being NULL, and all subsequent rows returning the score from three rows back.

Specify a Default Value

We can change the default value to be something other than NULL. Here’s an example:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 3, 'N/A' ) 
        OVER ( ORDER BY Score ) 
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Bruno   | Keep Quiet  |     1 | N/A  |
| Cooper  | Keep Quiet  |     8 | N/A  |
| Max     | Keep Quiet  |    12 | N/A  |
| Bruno   | Fetch Stick |    43 | 1    |
| Bruno   | Wag Tail    |    51 | 8    |
| Cooper  | Wag Tail    |    51 | 12   |
| Cooper  | Fetch Stick |    67 | 43   |
| Max     | Wag Tail    |    87 | 51   |
| Max     | Fetch Stick |    91 | 51   |
+---------+-------------+-------+------+

This time I specified that the string N/A be returned whenever there’s no value.

Partitioning

This example uses a PARTITION BY clause to partition the results by activity:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 1, 'N/A' ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score 
            )
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 | N/A  |
| Cooper  | Fetch Stick |    67 | 43   |
| Max     | Fetch Stick |    91 | 67   |
| Bruno   | Keep Quiet  |     1 | N/A  |
| Cooper  | Keep Quiet  |     8 | 1    |
| Max     | Keep Quiet  |    12 | 8    |
| Bruno   | Wag Tail    |    51 | N/A  |
| Cooper  | Wag Tail    |    51 | 51   |
| Max     | Wag Tail    |    87 | 51   |
+---------+-------------+-------+------+

This has caused the lag to reset with each partition.

The null_treatment Clause

As mentioned, the optional null_treatment clause specifies how to treat NULL values. The default value for this clause is RESPECT NULLS. Therefore, the above example implicitly uses RESPECT NULLS.

Let’s explicitly specify RESPECT NULLS:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 1, 'N/A' ) RESPECT NULLS
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score 
            )
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Bruno   | Fetch Stick |    43 | N/A  |
| Cooper  | Fetch Stick |    67 | 43   |
| Max     | Fetch Stick |    91 | 67   |
| Bruno   | Keep Quiet  |     1 | N/A  |
| Cooper  | Keep Quiet  |     8 | 1    |
| Max     | Keep Quiet  |    12 | 8    |
| Bruno   | Wag Tail    |    51 | N/A  |
| Cooper  | Wag Tail    |    51 | 51   |
| Max     | Wag Tail    |    87 | 51   |
+---------+-------------+-------+------+

Specifying IGNORE NULLS produces an error (at least as of MySQL 8.0.33):

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 1, 'N/A' ) IGNORE NULLS
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score 
            )
    AS "Lag"
FROM Dogs;

Result:

ERROR 1235 (42000): This version of MySQL doesn't yet support 'IGNORE NULLS'

Changing the Order

Here’s what happens when we change the order to descending order:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( Score, 1, 'N/A' ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score DESC
            )
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+------+
| DogName | Activity    | Score | Lag  |
+---------+-------------+-------+------+
| Max     | Fetch Stick |    91 | N/A  |
| Cooper  | Fetch Stick |    67 | 91   |
| Bruno   | Fetch Stick |    43 | 67   |
| Max     | Keep Quiet  |    12 | N/A  |
| Cooper  | Keep Quiet  |     8 | 12   |
| Bruno   | Keep Quiet  |     1 | 8    |
| Max     | Wag Tail    |    87 | N/A  |
| Bruno   | Wag Tail    |    51 | 87   |
| Cooper  | Wag Tail    |    51 | 51   |
+---------+-------------+-------+------+

Passing a Different Column to LAG()

In the above examples, the column that we pass to the LAG() function is the same one we pass to the ORDER BY clause. But that’s not a requirement. We can pass a different column if required.

Here’s an example that uses two different columns for LAG() and the ORDER BY clause:

SELECT 
    DogName,
    Activity,
    Score,
    LAG( DogName, 1, 'N/A' ) 
        OVER ( 
            PARTITION BY Activity
            ORDER BY Score
            )
    AS "Lag"
FROM Dogs;

Result:

+---------+-------------+-------+--------+
| DogName | Activity    | Score | Lag    |
+---------+-------------+-------+--------+
| Bruno   | Fetch Stick |    43 | N/A    |
| Cooper  | Fetch Stick |    67 | Bruno  |
| Max     | Fetch Stick |    91 | Cooper |
| Bruno   | Keep Quiet  |     1 | N/A    |
| Cooper  | Keep Quiet  |     8 | Bruno  |
| Max     | Keep Quiet  |    12 | Cooper |
| Bruno   | Wag Tail    |    51 | N/A    |
| Cooper  | Wag Tail    |    51 | Bruno  |
| Max     | Wag Tail    |    87 | Cooper |
+---------+-------------+-------+--------+

Here, we passed the DogName column to the LAG() function, but we sorted by the Score column. This resulted in the dog names being displayed in the Lag column.

There’s also a LEAD() function that returns the value from the other direction (i.e. it returns the value from the row that follows the current row by the given amount).