In PostgreSQL the nth_value()
function is a window function that returns the value from the given row of the current window frame. We provide the column and row number as an argument when we call the function.
Syntax
The syntax goes like this:
nth_value ( value, n )
So we pass the column as the value
argument, and we specify which row with the n
argument.
Example
Here’s a quick example to demonstrate:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER ( ORDER BY productprice )
FROM products;
Result:
vendorid | productname | productprice | nth_value ----------+---------------------------------+--------------+----------- 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | null 1003 | Hammock | 10.00 | null 1001 | Long Weight (green) | 11.99 | 11.99 1004 | Tea Pot | 12.45 | 11.99 1001 | Long Weight (blue) | 14.75 | 11.99 1001 | Left handed screwdriver | 25.99 | 11.99 1001 | Right handed screwdriver | 25.99 | 11.99 1002 | Sledge Hammer | 33.49 | 11.99 1003 | Straw Dog Box | 55.99 | 11.99 1003 | Chainsaw | 245.00 | 11.99
In this case we’re getting the value from the third row of the productprice
column. That’s because I passed productprice
as the first argument, and 3
as the second.
You may have noticed that the first two rows are null
in the nth_value
column. That’s because, by default, the window frame ends at the current row. Therefore, the window frame changes as each row is added to the results.
The reason is that the default frame is RANGE UNBOUNDED PRECEDING
, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
When we use an ORDER BY
clause, this sets the frame to be all rows from the partition start up through the current row’s last ORDER BY
peer. However, if we omit the ORDER BY
clause, all rows of the partition are included in the window frame, because all rows become peers of the current row.
Removing the ORDER BY
Clause
As mentioned, omitting the ORDER BY
clause causes all rows to become peers. Therefore, if we remove the ORDER BY
clause from the above example, all rows are considered peers of the current row, and the first two rows are no longer null
:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER ( )
FROM products;
Result:
vendorid | productname | productprice | nth_value ----------+---------------------------------+--------------+----------- 1001 | Left handed screwdriver | 25.99 | 14.75 1001 | Right handed screwdriver | 25.99 | 14.75 1001 | Long Weight (blue) | 14.75 | 14.75 1001 | Long Weight (green) | 11.99 | 14.75 1002 | Sledge Hammer | 33.49 | 14.75 1003 | Chainsaw | 245.00 | 14.75 1003 | Straw Dog Box | 55.99 | 14.75 1003 | Hammock | 10.00 | 14.75 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | 14.75 1004 | Tea Pot | 12.45 | 14.75
Changing the ORDER BY
Clause
Let’s reinstate the ORDER BY
clause, but explicitly sort by the vendorid
column:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER ( ORDER BY vendorid )
FROM products;
Result:
vendorid | productname | productprice | nth_value ----------+---------------------------------+--------------+----------- 1001 | Left handed screwdriver | 25.99 | 14.75 1001 | Right handed screwdriver | 25.99 | 14.75 1001 | Long Weight (blue) | 14.75 | 14.75 1001 | Long Weight (green) | 11.99 | 14.75 1002 | Sledge Hammer | 33.49 | 14.75 1003 | Chainsaw | 245.00 | 14.75 1003 | Straw Dog Box | 55.99 | 14.75 1003 | Hammock | 10.00 | 14.75 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | 14.75 1004 | Tea Pot | 12.45 | 14.75
In this case we got the same result as the unordered query. But that’s only because the unordered query happened to be in the same order.
Let’s change it to descending order:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER ( ORDER BY vendorid DESC )
FROM products;
Result:
vendorid | productname | productprice | nth_value ----------+---------------------------------+--------------+----------- 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | null 1004 | Tea Pot | 12.45 | null 1003 | Chainsaw | 245.00 | 245.00 1003 | Straw Dog Box | 55.99 | 245.00 1003 | Hammock | 10.00 | 245.00 1002 | Sledge Hammer | 33.49 | 245.00 1001 | Left handed screwdriver | 25.99 | 245.00 1001 | Long Weight (blue) | 14.75 | 245.00 1001 | Right handed screwdriver | 25.99 | 245.00 1001 | Long Weight (green) | 11.99 | 245.00
This time we get a different result, and the first two rows are again null
.
Change the Window Frame
As mentioned, the default frame is RANGE UNBOUNDED PRECEDING
, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
Here’s the same query from the first example, but this time it uses a frame clause to change the window frame:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER (
ORDER BY productprice
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
FROM products;
Result:
vendorid | productname | productprice | nth_value ----------+---------------------------------+--------------+----------- 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | 11.99 1003 | Hammock | 10.00 | 11.99 1001 | Long Weight (green) | 11.99 | 11.99 1004 | Tea Pot | 12.45 | 11.99 1001 | Long Weight (blue) | 14.75 | 11.99 1001 | Left handed screwdriver | 25.99 | 11.99 1001 | Right handed screwdriver | 25.99 | 11.99 1002 | Sledge Hammer | 33.49 | 11.99 1003 | Straw Dog Box | 55.99 | 11.99 1003 | Chainsaw | 245.00 | 11.99
This time we get the last value of the whole result set. That’s because I specified ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
as the frame clause. It’s the UNBOUNDED FOLLOWING
part that changed the result.
Just to be clear, here it is with and without the frame clause:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 ) OVER (
ORDER BY productprice
) AS "Frame 1",
nth_value( productprice, 3 ) OVER (
ORDER BY productprice
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS "Frame 2"
FROM products;
Result:
vendorid | productname | productprice | Frame 1 | Frame 2 ----------+---------------------------------+--------------+---------+--------- 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | null | 11.99 1003 | Hammock | 10.00 | null | 11.99 1001 | Long Weight (green) | 11.99 | 11.99 | 11.99 1004 | Tea Pot | 12.45 | 11.99 | 11.99 1001 | Long Weight (blue) | 14.75 | 11.99 | 11.99 1001 | Left handed screwdriver | 25.99 | 11.99 | 11.99 1001 | Right handed screwdriver | 25.99 | 11.99 | 11.99 1002 | Sledge Hammer | 33.49 | 11.99 | 11.99 1003 | Straw Dog Box | 55.99 | 11.99 | 11.99 1003 | Chainsaw | 245.00 | 11.99 | 11.99
Partition the Results
Let’s add a PARTITION BY
clause to partition by vendor:
SELECT
v.vendorname,
p.productname,
p.productprice,
nth_value( p.productprice, 2 ) OVER (
PARTITION BY p.vendorid
ORDER BY p.productprice
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
FROM products p INNER JOIN
vendors v ON p.vendorid = v.vendorid;
Result:
vendorname | productname | productprice | nth_value ---------------+---------------------------------+--------------+----------- Mars Supplies | Long Weight (green) | 11.99 | 14.75 Mars Supplies | Long Weight (blue) | 14.75 | 14.75 Mars Supplies | Right handed screwdriver | 25.99 | 14.75 Mars Supplies | Left handed screwdriver | 25.99 | 14.75 Randy Roofers | Sledge Hammer | 33.49 | null Pedal Medals | Hammock | 10.00 | 55.99 Pedal Medals | Straw Dog Box | 55.99 | 55.99 Pedal Medals | Chainsaw | 245.00 | 55.99 Katty Kittens | Bottomless Coffee Mugs (4 Pack) | 9.99 | 12.45 Katty Kittens | Tea Pot | 12.45 | 12.45
We can see that the nth_value
column resets with each new partition. I included UNBOUNDED FOLLOWING
in the frame clause so that, for any given row, the current window frame encompasses all rows in the partition.
We can see that the null
value is in the partition that only has one row. That’s because there’s no second row for which to get a value. The other partitions have at least two rows, and so we get a non-null
value for those rows.
I also ran a SQL join between the products
table and the vendors
table in order to get each vendor’s name instead of just the ID.
Removing the OVER
Clause
The OVER
clause is required when using nonaggregate window functions such as nth_value()
. Removing it results in an error:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3 )
FROM products;
Result:
ERROR: window function nth_value requires an OVER clause LINE 5: nth_value( productprice, 3 ) ^
Omitting an Argument
Omitting any of the arguments also causes an error:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice ) OVER ( ORDER BY productprice )
FROM products;
Result:
ERROR: function nth_value(numeric) does not exist LINE 5: nth_value( productprice ) OVER ( ORDER BY productprice ) ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.
And:
SELECT
vendorid,
productname,
productprice,
nth_value( 3 ) OVER ( ORDER BY productprice )
FROM products;
Result:
ERROR: function nth_value(integer) does not exist LINE 5: nth_value( 3 ) OVER ( ORDER BY productprice ) ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.
So we need to pass two arguments.
Wrong Argument Type
The second argument must be an integer type. Here’s what happens when we pass the wrong type:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 3.5 ) OVER ( ORDER BY productprice )
FROM products;
Result:
ERROR: function nth_value(numeric, numeric) does not exist LINE 5: nth_value( productprice, 3.5 ) OVER ( ORDER BY productpr... ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts.
And here’s another one:
SELECT
vendorid,
productname,
productprice,
nth_value( productprice, 'three' ) OVER ( ORDER BY productprice )
FROM products;
Result:
ERROR: invalid input syntax for type integer: "three" LINE 5: nth_value( productprice, 'three' ) OVER ( ORDER BY produ... ^