In SQL Server, the CUME_DIST()
function calculates and returns the cumulative distribution of a value within a group of values. This is the relative position of a specified value in a group of values.
Syntax
The syntax goes like this:
CUME_DIST( )
OVER ( [ partition_by_clause ] order_by_clause )
So it doesn’t require or accept any arguments, but it requires an OVER
clause with at least an ORDER BY
clause.
Example
Here’s an example of using the CUME_DIST()
function:
SELECT
VendorId,
ProductName,
ProductPrice,
CUME_DIST( ) OVER (
ORDER BY ProductPrice
) AS CUME_DIST
FROM Products;
Result:
VendorId ProductName ProductPrice CUME_DIST -------- ------------------------------- ------------ --------- 1004 Bottomless Coffee Mugs (4 Pack) 9.99 0.1 1003 Hammock 10 0.2 1001 Long Weight (green) 11.99 0.3 1004 Tea Pot 12.45 0.4 1001 Long Weight (blue) 14.75 0.5 1001 Left handed screwdriver 25.99 0.7 1001 Right handed screwdriver 25.99 0.7 1002 Sledge Hammer 33.49 0.8 1003 Straw Dog Box 55.99 0.9 1003 Chainsaw 245 1
Here, the CUME_DIST
column displays the cumulative distribution of the ProductPrice
column when sorted in ascending order.
We can see that the values range from 0.1
to 1
. We can also see that two products have the same price (the left and right handed screwdrivers), and so they share the same cumulative distribution.
Partitioning the Result Set
We can use a PARTITION BY
clause in order to partition the result set by a given column. For example, we can partition by vendor:
SELECT
v.VendorName,
p.ProductName,
p.ProductPrice,
CUME_DIST( ) OVER (
PARTITION BY p.VendorId
ORDER BY p.ProductPrice
) AS CUME_DIST
FROM Products p
INNER JOIN Vendors v
ON v.VendorId = p.VendorId;
Result:
VendorName ProductName ProductPrice CUME_DIST ------------- ------------------------------- ------------ ------------------ Mars Supplies Long Weight (green) 11.99 0.25 Mars Supplies Long Weight (blue) 14.75 0.5 Mars Supplies Left handed screwdriver 25.99 1 Mars Supplies Right handed screwdriver 25.99 1 Randy Roofers Sledge Hammer 33.49 1 Pedal Medals Hammock 10 0.3333333333333333 Pedal Medals Straw Dog Box 55.99 0.6666666666666666 Pedal Medals Chainsaw 245 1 Katty Kittens Bottomless Coffee Mugs (4 Pack) 9.99 0.5 Katty Kittens Tea Pot 12.45 1
Here we get the cumulative distribution within each partition. Therefore the CUME_DIST
column resets with each new partition.
The OVER
Clause is Required
We must provide the OVER
clause when using the CUME_DIST()
function. Here’s what happens if we remove it:
SELECT
VendorId,
ProductName,
ProductPrice,
CUME_DIST( ) AS CUME_DIST
FROM Products;
Result:
Error: The function 'CUME_DIST' must have an OVER clause.
The ORDER BY
Clause is Required
Furthermore, the OVER
clause must have at least an ORDER BY
clause. Here’s what happens when we include an OVER
clause without an ORDER BY
clause:
SELECT
VendorId,
ProductName,
ProductPrice,
CUME_DIST( ) OVER ( ) AS CUME_DIST
FROM Products;
Result:
Error: The function 'CUME_DIST' must have an OVER clause with ORDER BY.