Understanding the FIRST_VALUE() Function in SQL

In SQL databases, the FIRST_VALUE() function is a window function that returns the first value in an ordered set of values.

The exact syntax will probably depend on your DBMS, but it will usually require an OVER clause to determine how the rowset is partitioned and ordered before the window function is applied.

Example

Here’s a basic example to demonstrate how the SQL FIRST_VALUE() function works:

SELECT 
    DogName,
    Activity,
    Score,
    FIRST_VALUE( Score ) 
        OVER ( ORDER BY Score ROWS UNBOUNDED PRECEDING ) 
    AS "First Value"
FROM Dogs;

Result:

+---------+-------------+-------+-------------+
| DogName | Activity    | Score | First Value |
+---------+-------------+-------+-------------+
| Max     | Keep Quiet  |     1 |           1 |
| Bruno   | Keep Quiet  |     2 |           1 |
| Cooper  | Keep Quiet  |     8 |           1 |
| Bruno   | Fetch Stick |    43 |           1 |
| Cooper  | Wag Tail    |    51 |           1 |
| Bruno   | Wag Tail    |    65 |           1 |
| Cooper  | Fetch Stick |    67 |           1 |
| Max     | Wag Tail    |    87 |           1 |
| Max     | Fetch Stick |    91 |           1 |
+---------+-------------+-------+-------------+

Here, we use the FIRST_VALUE() function to get the first value from the Score column. In this case the first value is 1, and so that’s the value that’s returned for all rows in the partition/result set.

In this example we didn’t specify any partitions, and so the whole result set is treated as one big partition.

Partitioning

We can use the PARTITION BY clause to separate the results into partitions. When we do this, the FIRST_VALUE() function returns the first value for each partition:

SELECT 
    DogName,
    Activity,
    Score,
    FIRST_VALUE( Score ) OVER (
        PARTITION BY Activity 
        ORDER BY Score 
        ROWS UNBOUNDED PRECEDING
    ) AS "First Value"
FROM Dogs;

Result:

+---------+-------------+-------+-------------+
| DogName | Activity    | Score | First Value |
+---------+-------------+-------+-------------+
| Bruno   | Fetch Stick |    43 |          43 |
| Cooper  | Fetch Stick |    67 |          43 |
| Max     | Fetch Stick |    91 |          43 |
| Max     | Keep Quiet  |     1 |           1 |
| Bruno   | Keep Quiet  |     2 |           1 |
| Cooper  | Keep Quiet  |     8 |           1 |
| Cooper  | Wag Tail    |    51 |          51 |
| Bruno   | Wag Tail    |    65 |          51 |
| Max     | Wag Tail    |    87 |          51 |
+---------+-------------+-------+-------------+

Here, we partitioned the results by the Activity column, and so the FIRST_VALUE() function returned a different value for each partition. Each partition started with a new value, and so the values in the First Value column reflect that.

Named Windows

We can use a WINDOW clause to define a named window:

SELECT 
    DogName,
    Activity,
    Score,
    FIRST_VALUE( Score ) OVER win AS "First Value"
FROM Dogs
WINDOW win AS ( 
    PARTITION BY Activity 
    ORDER BY Score 
    ROWS UNBOUNDED PRECEDING
    );

Result:

+---------+-------------+-------+-------------+
| DogName | Activity    | Score | First Value |
+---------+-------------+-------+-------------+
| Bruno   | Fetch Stick |    43 |          43 |
| Cooper  | Fetch Stick |    67 |          43 |
| Max     | Fetch Stick |    91 |          43 |
| Max     | Keep Quiet  |     1 |           1 |
| Bruno   | Keep Quiet  |     2 |           1 |
| Cooper  | Keep Quiet  |     8 |           1 |
| Cooper  | Wag Tail    |    51 |          51 |
| Bruno   | Wag Tail    |    65 |          51 |
| Max     | Wag Tail    |    87 |          51 |
+---------+-------------+-------+-------------+

Using named windows allows us to move the window’s definition away from the OVER clause, and to the bottom of the query. This can be handy if our SELECT list contains multiple window functions that use the same definition.

Your ability to use a named window may depend on your DBMS. For example, at the time of writing, SQL Server has only recently introduced this capability (in SQL Server 2022). Trying to use the WINDOW clause in earlier versions will result in an error.

Changing the Order

The actual value returned by the FIRST_VALUE() function is dependent on the ORDER BY clause. In the previous examples, we sorted the partitions by the Score column in ascending order.

Let’s change the order to descending order:

SELECT 
    DogName,
    Activity,
    Score,
    FIRST_VALUE( Score ) OVER win AS "First Value"
FROM Dogs
WINDOW win AS ( 
    PARTITION BY Activity 
    ORDER BY Score DESC
    ROWS UNBOUNDED PRECEDING
    );

Result:

+---------+-------------+-------+-------------+
| DogName | Activity    | Score | First Value |
+---------+-------------+-------+-------------+
| Max     | Fetch Stick |    91 |          91 |
| Cooper  | Fetch Stick |    67 |          91 |
| Bruno   | Fetch Stick |    43 |          91 |
| Cooper  | Keep Quiet  |     8 |           8 |
| Bruno   | Keep Quiet  |     2 |           8 |
| Max     | Keep Quiet  |     1 |           8 |
| Max     | Wag Tail    |    87 |          87 |
| Bruno   | Wag Tail    |    65 |          87 |
| Cooper  | Wag Tail    |    51 |          87 |
+---------+-------------+-------+-------------+

As expected, the highest value is now the first value in each partition.

Passing a Different Column to FIRST_VALUE()

We can return a different column than the one specified in the ORDER BY clause:

SELECT 
    District, 
    Name, 
    Population AS "City Population", 
    FIRST_VALUE( Name ) OVER win AS "First Value"
FROM City 
WHERE CountryCode = 'AUS'
WINDOW win AS ( 
    PARTITION BY District 
    ORDER BY Population DESC
    ROWS UNBOUNDED PRECEDING
    );

Result:

+-----------------+---------------+-----------------+-------------+
| District        | Name          | City Population | First Value |
+-----------------+---------------+-----------------+-------------+
| Capital Region  | Canberra      |          322723 | Canberra    |
| New South Wales | Sydney        |         3276207 | Sydney      |
| New South Wales | Newcastle     |          270324 | Sydney      |
| New South Wales | Central Coast |          227657 | Sydney      |
| New South Wales | Wollongong    |          219761 | Sydney      |
| Queensland      | Brisbane      |         1291117 | Brisbane    |
| Queensland      | Gold Coast    |          311932 | Brisbane    |
| Queensland      | Townsville    |          109914 | Brisbane    |
| Queensland      | Cairns        |           92273 | Brisbane    |
| South Australia | Adelaide      |          978100 | Adelaide    |
| Tasmania        | Hobart        |          126118 | Hobart      |
| Victoria        | Melbourne     |         2865329 | Melbourne   |
| Victoria        | Geelong       |          125382 | Melbourne   |
| West Australia  | Perth         |         1096829 | Perth       |
+-----------------+---------------+-----------------+-------------+
14 rows in set (0.01 sec)

This table contains Australian cities and their populations, as well as the district/state that each city belongs.

In this case we sorted the row set so that the city with the highest population in its state comes first. But this time we return the city name in the First Value column instead of the population.

In other words, we returned the name of the city with the highest population in its district.

Treatment of NULL Values

The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for some window functions, such as FIRST_VALUE().

Here’s an example of explicitly specifying RESPECT NULLS:

SELECT 
    District, 
    Name, 
    Population AS "City Population", 
    FIRST_VALUE( Name ) RESPECT NULLS OVER win AS "First Value"
FROM City 
WHERE CountryCode = 'AUS'
WINDOW win AS ( 
    PARTITION BY District 
    ORDER BY Population DESC
    ROWS UNBOUNDED PRECEDING
    );

Result:

+-----------------+---------------+-----------------+-------------+
| District        | Name          | City Population | First Value |
+-----------------+---------------+-----------------+-------------+
| Capital Region  | Canberra      |          322723 | Canberra    |
| New South Wales | Sydney        |         3276207 | Sydney      |
| New South Wales | Newcastle     |          270324 | Sydney      |
| New South Wales | Central Coast |          227657 | Sydney      |
| New South Wales | Wollongong    |          219761 | Sydney      |
| Queensland      | Brisbane      |         1291117 | Brisbane    |
| Queensland      | Gold Coast    |          311932 | Brisbane    |
| Queensland      | Townsville    |          109914 | Brisbane    |
| Queensland      | Cairns        |           92273 | Brisbane    |
| South Australia | Adelaide      |          978100 | Adelaide    |
| Tasmania        | Hobart        |          126118 | Hobart      |
| Victoria        | Melbourne     |         2865329 | Melbourne   |
| Victoria        | Geelong       |          125382 | Melbourne   |
| West Australia  | Perth         |         1096829 | Perth       |
+-----------------+---------------+-----------------+-------------+

We didn’t actually have any NULL values in our data set, and so this option is irrelevant in our case. But at least it’s an option.

The RESPECT NULLS/IGNORE NULLS option may or may not be supported by your DBMS. Some DBMSs (such as MySQL and PostgreSQL) support the RESPECT NULLS option, but not the IGNORE NULLS option. SQL Server introduced support for RESPECT NULLS and IGNORE NULLS in SQL Server 2022.

Here’s an example of the error returned by MySQL 8.0.33 when we specify IGNORE NULLS:

SELECT 
    District, 
    Name, 
    Population AS "City Population", 
    FIRST_VALUE( Name ) IGNORE NULLS OVER win AS "First Value"
FROM City 
WHERE CountryCode = 'AUS'
WINDOW win AS ( 
    PARTITION BY District 
    ORDER BY Population DESC
    ROWS UNBOUNDED PRECEDING
    );

Result:

ERROR 1235 (42000): This version of MySQL doesn't yet support 'IGNORE NULLS'

By the way, I ran the above RESPECT NULLS example against the same MySQL 8.0.33 installation without error. So in this case, MySQL allows us to use RESPECT NULLS (which is also the default), but not IGNORE NULLS.