In PostgreSQL, we can use the percent_rank()
function to return the relative rank of each row, expressed as a percentage ranging from 0 to 1 inclusive.
Syntax
The syntax goes like this:
percent_rank() over_clause
The OVER
clause determines how the rowset is partitioned and ordered before the window function is applied.
Example
Here’s a basic example to demonstrate how it works:
SELECT
vendorid,
productname,
productprice,
percent_rank() OVER ( ORDER BY productprice )
FROM products;
Result:
vendorid | productname | productprice | percent_rank ----------+---------------------------------+--------------+-------------------- 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | 0 1003 | Hammock | 10.00 | 0.1111111111111111 1001 | Long Weight (green) | 11.99 | 0.2222222222222222 1004 | Tea Pot | 12.45 | 0.3333333333333333 1001 | Long Weight (blue) | 14.75 | 0.4444444444444444 1001 | Left handed screwdriver | 25.99 | 0.5555555555555556 1001 | Right handed screwdriver | 25.99 | 0.5555555555555556 1002 | Sledge Hammer | 33.49 | 0.7777777777777778 1003 | Straw Dog Box | 55.99 | 0.8888888888888888 1003 | Chainsaw | 245.00 | 1 (10 rows)
We can see that the percentage rank values range from 0 to 1 as mentioned. In this example, two rows have the same price. Therefore, they share the same percentage rank value. These are the Left handed screwdriver
and Right handed screwdriver
rows.
The ORDER BY
clause is important when we use the percent_rank()
function. If we omit it, the ordering is nondeterministic, and all rows are peers (which results in all rows getting the same relative rank):
SELECT
vendorid,
productname,
productprice,
percent_rank() OVER ( )
FROM products;
Result:
vendorid | productname | productprice | percent_rank ----------+---------------------------------+--------------+-------------- 1001 | Left handed screwdriver | 25.99 | 0 1001 | Right handed screwdriver | 25.99 | 0 1001 | Long Weight (blue) | 14.75 | 0 1001 | Long Weight (green) | 11.99 | 0 1002 | Sledge Hammer | 33.49 | 0 1003 | Chainsaw | 245.00 | 0 1003 | Straw Dog Box | 55.99 | 0 1003 | Hammock | 10.00 | 0 1004 | Bottomless Coffee Mugs (4 Pack) | 9.99 | 0 1004 | Tea Pot | 12.45 | 0 (10 rows)
So we can see that by omitting the ORDER BY
clause, all rows now have a percentage rank of zero, which is not very helpful.
We haven’t applied any partitioning yet, and so the percentage rank is applied across the whole result set.
Partitioning
This example uses a PARTITION BY
clause to partition the results by the vendor:
SELECT
v.vendorname,
p.productname,
p.productprice,
percent_rank() OVER (
PARTITION BY p.vendorid
ORDER BY p.productprice
)
FROM products p INNER JOIN
vendors v ON p.vendorid = v.vendorid;
Result:
vendorname | productname | productprice | percent_rank ---------------+---------------------------------+--------------+-------------------- Mars Supplies | Long Weight (green) | 11.99 | 0 Mars Supplies | Long Weight (blue) | 14.75 | 0.3333333333333333 Mars Supplies | Right handed screwdriver | 25.99 | 0.6666666666666666 Mars Supplies | Left handed screwdriver | 25.99 | 0.6666666666666666 Randy Roofers | Sledge Hammer | 33.49 | 0 Pedal Medals | Hammock | 10.00 | 0 Pedal Medals | Straw Dog Box | 55.99 | 0.5 Pedal Medals | Chainsaw | 245.00 | 1 Katty Kittens | Bottomless Coffee Mugs (4 Pack) | 9.99 | 0 Katty Kittens | Tea Pot | 12.45 | 1 (10 rows)
This time the percentage rank values reset with each partition.
In this case I performed a SQL join on the vendors
table in order to return the vendor name instead of the vendor ID.
Changing the Order
Here’s what happens when we change the order to descending order:
SELECT
v.vendorname,
p.productname,
p.productprice,
percent_rank() OVER (
PARTITION BY p.vendorid
ORDER BY p.productprice DESC
)
FROM products p INNER JOIN
vendors v ON p.vendorid = v.vendorid;
Result:
vendorname | productname | productprice | percent_rank ---------------+---------------------------------+--------------+-------------------- Mars Supplies | Left handed screwdriver | 25.99 | 0 Mars Supplies | Right handed screwdriver | 25.99 | 0 Mars Supplies | Long Weight (blue) | 14.75 | 0.6666666666666666 Mars Supplies | Long Weight (green) | 11.99 | 1 Randy Roofers | Sledge Hammer | 33.49 | 0 Pedal Medals | Chainsaw | 245.00 | 0 Pedal Medals | Straw Dog Box | 55.99 | 0.5 Pedal Medals | Hammock | 10.00 | 1 Katty Kittens | Tea Pot | 12.45 | 0 Katty Kittens | Bottomless Coffee Mugs (4 Pack) | 9.99 | 1 (10 rows)
The data within each partition is returned in a different order, and the rows are assigned their percentage ranks based on that order, which in some cases means that a product receives a different percentage rank than in the previous example. For instance, both screwdrivers now get a percentage rank of zero, whereas they were previously 0.6666666666666666.
The OVER
Clause is Required
We must provide an OVER
clause when using the percent_rank()
function. This determines how the rowset is partitioned and ordered before the window function is applied.
Omitting the OVER
clause results in an error:
SELECT
vendorid,
productname,
productprice,
percent_rank()
FROM products;
Result:
ERROR: window function percent_rank requires an OVER clause