Understanding the NTH_VALUE() Function in PostgreSQL

In PostgreSQL the nth_value() function is a window function that returns the value from the given row of the current window frame. We provide the column and row number as an argument when we call the function.

Syntax

The syntax goes like this:

nth_value ( value, n ) 

So we pass the column as the value argument, and we specify which row with the n argument.

Example

Here’s a quick example to demonstrate:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER ( ORDER BY productprice )
FROM products;

Result:

 vendorid |           productname           | productprice | nth_value 
----------+---------------------------------+--------------+-----------
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |      null
     1003 | Hammock                         |        10.00 |      null
     1001 | Long Weight (green)             |        11.99 |     11.99
     1004 | Tea Pot                         |        12.45 |     11.99
     1001 | Long Weight (blue)              |        14.75 |     11.99
     1001 | Left handed screwdriver         |        25.99 |     11.99
     1001 | Right handed screwdriver        |        25.99 |     11.99
     1002 | Sledge Hammer                   |        33.49 |     11.99
     1003 | Straw Dog Box                   |        55.99 |     11.99
     1003 | Chainsaw                        |       245.00 |     11.99

In this case we’re getting the value from the third row of the productprice column. That’s because I passed productprice as the first argument, and 3 as the second.

You may have noticed that the first two rows are null in the nth_value column. That’s because, by default, the window frame ends at the current row. Therefore, the window frame changes as each row is added to the results.

The reason is that the default frame is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

When we use an ORDER BY clause, this sets the frame to be all rows from the partition start up through the current row’s last ORDER BY peer. However, if we omit the ORDER BY clause, all rows of the partition are included in the window frame, because all rows become peers of the current row.

Removing the ORDER BY Clause

As mentioned, omitting the ORDER BY clause causes all rows to become peers. Therefore, if we remove the ORDER BY clause from the above example, all rows are considered peers of the current row, and the first two rows are no longer null:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER (  )
FROM products;

Result:

 vendorid |           productname           | productprice | nth_value 
----------+---------------------------------+--------------+-----------
     1001 | Left handed screwdriver         |        25.99 |     14.75
     1001 | Right handed screwdriver        |        25.99 |     14.75
     1001 | Long Weight (blue)              |        14.75 |     14.75
     1001 | Long Weight (green)             |        11.99 |     14.75
     1002 | Sledge Hammer                   |        33.49 |     14.75
     1003 | Chainsaw                        |       245.00 |     14.75
     1003 | Straw Dog Box                   |        55.99 |     14.75
     1003 | Hammock                         |        10.00 |     14.75
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |     14.75
     1004 | Tea Pot                         |        12.45 |     14.75

Changing the ORDER BY Clause

Let’s reinstate the ORDER BY clause, but explicitly sort by the vendorid column:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER ( ORDER BY vendorid )
FROM products;

Result:

 vendorid |           productname           | productprice | nth_value 
----------+---------------------------------+--------------+-----------
     1001 | Left handed screwdriver         |        25.99 |     14.75
     1001 | Right handed screwdriver        |        25.99 |     14.75
     1001 | Long Weight (blue)              |        14.75 |     14.75
     1001 | Long Weight (green)             |        11.99 |     14.75
     1002 | Sledge Hammer                   |        33.49 |     14.75
     1003 | Chainsaw                        |       245.00 |     14.75
     1003 | Straw Dog Box                   |        55.99 |     14.75
     1003 | Hammock                         |        10.00 |     14.75
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |     14.75
     1004 | Tea Pot                         |        12.45 |     14.75

In this case we got the same result as the unordered query. But that’s only because the unordered query happened to be in the same order.

Let’s change it to descending order:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER ( ORDER BY vendorid DESC )
FROM products;

Result:

 vendorid |           productname           | productprice | nth_value 
----------+---------------------------------+--------------+-----------
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |      null
     1004 | Tea Pot                         |        12.45 |      null
     1003 | Chainsaw                        |       245.00 |    245.00
     1003 | Straw Dog Box                   |        55.99 |    245.00
     1003 | Hammock                         |        10.00 |    245.00
     1002 | Sledge Hammer                   |        33.49 |    245.00
     1001 | Left handed screwdriver         |        25.99 |    245.00
     1001 | Long Weight (blue)              |        14.75 |    245.00
     1001 | Right handed screwdriver        |        25.99 |    245.00
     1001 | Long Weight (green)             |        11.99 |    245.00

This time we get a different result, and the first two rows are again null.

Change the Window Frame

As mentioned, the default frame is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Here’s the same query from the first example, but this time it uses a frame clause to change the window frame:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER ( 
        ORDER BY productprice
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        )
FROM products;

Result:

 vendorid |           productname           | productprice | nth_value 
----------+---------------------------------+--------------+-----------
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |     11.99
     1003 | Hammock                         |        10.00 |     11.99
     1001 | Long Weight (green)             |        11.99 |     11.99
     1004 | Tea Pot                         |        12.45 |     11.99
     1001 | Long Weight (blue)              |        14.75 |     11.99
     1001 | Left handed screwdriver         |        25.99 |     11.99
     1001 | Right handed screwdriver        |        25.99 |     11.99
     1002 | Sledge Hammer                   |        33.49 |     11.99
     1003 | Straw Dog Box                   |        55.99 |     11.99
     1003 | Chainsaw                        |       245.00 |     11.99

This time we get the last value of the whole result set. That’s because I specified ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING as the frame clause. It’s the UNBOUNDED FOLLOWING part that changed the result.

Just to be clear, here it is with and without the frame clause:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 ) OVER ( 
        ORDER BY productprice
        ) AS "Frame 1",
    nth_value( productprice, 3 ) OVER ( 
        ORDER BY productprice
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS "Frame 2"
FROM products;

Result:

 vendorid |           productname           | productprice | Frame 1 | Frame 2 
----------+---------------------------------+--------------+---------+---------
     1004 | Bottomless Coffee Mugs (4 Pack) |         9.99 |    null |   11.99
     1003 | Hammock                         |        10.00 |    null |   11.99
     1001 | Long Weight (green)             |        11.99 |   11.99 |   11.99
     1004 | Tea Pot                         |        12.45 |   11.99 |   11.99
     1001 | Long Weight (blue)              |        14.75 |   11.99 |   11.99
     1001 | Left handed screwdriver         |        25.99 |   11.99 |   11.99
     1001 | Right handed screwdriver        |        25.99 |   11.99 |   11.99
     1002 | Sledge Hammer                   |        33.49 |   11.99 |   11.99
     1003 | Straw Dog Box                   |        55.99 |   11.99 |   11.99
     1003 | Chainsaw                        |       245.00 |   11.99 |   11.99

Partition the Results

Let’s add a PARTITION BY clause to partition by vendor:

SELECT 
    v.vendorname,
    p.productname,
    p.productprice,
    nth_value( p.productprice, 2 ) OVER ( 
        PARTITION BY p.vendorid 
        ORDER BY p.productprice
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        )
FROM products p INNER JOIN 
vendors v ON p.vendorid = v.vendorid;

Result:

  vendorname   |           productname           | productprice | nth_value 
---------------+---------------------------------+--------------+-----------
 Mars Supplies | Long Weight (green)             |        11.99 |     14.75
 Mars Supplies | Long Weight (blue)              |        14.75 |     14.75
 Mars Supplies | Right handed screwdriver        |        25.99 |     14.75
 Mars Supplies | Left handed screwdriver         |        25.99 |     14.75
 Randy Roofers | Sledge Hammer                   |        33.49 |      null
 Pedal Medals  | Hammock                         |        10.00 |     55.99
 Pedal Medals  | Straw Dog Box                   |        55.99 |     55.99
 Pedal Medals  | Chainsaw                        |       245.00 |     55.99
 Katty Kittens | Bottomless Coffee Mugs (4 Pack) |         9.99 |     12.45
 Katty Kittens | Tea Pot                         |        12.45 |     12.45

We can see that the nth_value column resets with each new partition. I included UNBOUNDED FOLLOWING in the frame clause so that, for any given row, the current window frame encompasses all rows in the partition.

We can see that the null value is in the partition that only has one row. That’s because there’s no second row for which to get a value. The other partitions have at least two rows, and so we get a non-null value for those rows.

I also ran a SQL join between the products table and the vendors table in order to get each vendor’s name instead of just the ID.

Removing the OVER Clause

The OVER clause is required when using nonaggregate window functions such as nth_value(). Removing it results in an error:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3 )
FROM products;

Result:

ERROR:  window function nth_value requires an OVER clause
LINE 5:     nth_value( productprice, 3 )
            ^

Omitting an Argument

Omitting any of the arguments also causes an error:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice ) OVER ( ORDER BY productprice )
FROM products;

Result:

ERROR:  function nth_value(numeric) does not exist
LINE 5:     nth_value( productprice ) OVER ( ORDER BY productprice )
            ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

And:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( 3 ) OVER ( ORDER BY productprice )
FROM products;

Result:

ERROR:  function nth_value(integer) does not exist
LINE 5:     nth_value( 3 ) OVER ( ORDER BY productprice )
            ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

So we need to pass two arguments.

Wrong Argument Type

The second argument must be an integer type. Here’s what happens when we pass the wrong type:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 3.5 ) OVER ( ORDER BY productprice )
FROM products;

Result:

ERROR:  function nth_value(numeric, numeric) does not exist
LINE 5:     nth_value( productprice, 3.5 ) OVER ( ORDER BY productpr...
            ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

And here’s another one:

SELECT
    vendorid,
    productname,
    productprice,
    nth_value( productprice, 'three' ) OVER ( ORDER BY productprice )
FROM products;

Result:

ERROR:  invalid input syntax for type integer: "three"
LINE 5:     nth_value( productprice, 'three' ) OVER ( ORDER BY produ...
                                     ^