In SQL Server, the APPROX_PERCENTILE_DISC()
function calculates and returns an approximate percentile based on a discrete distribution of the column values.
We pass the desired percentile to the function when we call it.
The APPROX_PERCENTILE_DISC()
function can be used in place of the PERCENTILE_DISC()
function if you just want to quickly get an approximate value. This could be useful on larger data sets, where using PERCENTILE_DISC()
to get an accurate percentile value could take a much longer time to execute.
The APPROX_PERCENTILE_DISC()
function was introduced in SQL Server 2022.
Syntax
The syntax goes like this:
APPROX_PERCENTILE_DISC (numeric_literal)
WITHIN GROUP (ORDER BY order_by_expression [ASC|DESC])
The numeric_literal
argument is in the range between 0.0
and 1.0
. This (required) value is the percentile to compute.
We must provide a WITHIN GROUP
clause with an ORDER BY
clause. Only one ORDER BY
expression is allowed.
Example
Suppose we have the following table:
SELECT * FROM Products;
Result:
ProductId VendorId ProductName ProductPrice ----------- ----------- ---------------------------------------- --------------------- 1 1001 Left handed screwdriver 25.9900 2 1001 Right handed screwdriver 25.9900 3 1001 Long Weight (blue) 14.7500 4 1001 Long Weight (green) 11.9900 5 1002 Sledge Hammer 33.4900 6 1003 Chainsaw 245.0000 7 1003 Straw Dog Box 55.9900 8 1003 Hammock 10.0000 9 1004 Bottomless Coffee Mugs (4 Pack) 9.9900 10 1004 Tea Pot 12.4500
We can use the APPROX_PERCENTILE_DISC()
function against that table like this:
SELECT
APPROX_PERCENTILE_DISC( 0.5 )
WITHIN GROUP ( ORDER BY ProductPrice ) AS APPROX_PERCENTILE_DISC
FROM Products;
Result:
APPROX_PERCENTILE_DISC ---------------------- 14.7500
In this example, we used an argument of 0.5
in order to calculate the median price of all products in the table, sorted by product price (thanks to the WITHIN GROUP
clause).
The APPROX_PERCENTILE_DISC()
function always returns a value from the underlying data set. This is in contrast to the APPROX_PERCENTILE_CONT()
function, which returns an interpolated value (which means that it isn’t necessarily in the actual data set).
Another Example
Here’s an example that uses different percentiles:
SELECT
APPROX_PERCENTILE_DISC( 0.3 )
WITHIN GROUP ( ORDER BY ProductPrice ) AS Result1,
APPROX_PERCENTILE_DISC( 0.7 )
WITHIN GROUP ( ORDER BY ProductPrice ) AS Result2,
APPROX_PERCENTILE_DISC( 0.9 )
WITHIN GROUP ( ORDER BY ProductPrice ) AS Result3
FROM Products;
Result:
Result1 Result2 Result3 --------------------- --------------------- --------------------- 11.9900 25.9900 55.9900
As expected, all three values are in the underlying data set.
The WITHIN GROUP
Clause is Required
We must provide the WITHIN GROUP
clause when using APPROX_PERCENTILE_DISC()
. Here’s what happens if we remove it:
SELECT
APPROX_PERCENTILE_DISC( 0.5 )
FROM Products;
Result:
Msg 10754, Level 15, State 2, Server c2cfd5de5ad7, Line 33 The function 'APPROX_PERCENTILE_DISC' must have a WITHIN GROUP clause.
About the ORDER BY
Clause
The WITHIN GROUP
clause must have an ORDER BY
sub-clause, and that ORDER BY
clause must have exactly one expression (no more, no less).
Here’s what happens when we include more than one ORDER BY
expression:
SELECT
APPROX_PERCENTILE_DISC( 0.5 )
WITHIN GROUP ( ORDER BY ProductPrice, ProductId )
FROM Products;
Result:
Msg 10751, Level 15, State 2, Server 1f674fe22ff6, Line 33 The ORDER BY in WITHIN GROUP clause of 'APPROX_PERCENTILE_DISC' function must have exactly one expression.
The error message is self explanatory – we can only have one expression in the ORDER BY
clause in the WITHIN GROUP
clause.
Here’s what happens if we remove the ORDER BY
clause altogether:
SELECT
APPROX_PERCENTILE_DISC( 0.5 )
WITHIN GROUP ( )
FROM Products;
Result:
Msg 102, Level 15, State 1, Server c2cfd5de5ad7, Line 34 Incorrect syntax near ')'.
It tells us we have a syntax error. This is because the WITHIN GROUP
clause doesn’t contain an ORDER BY
sub-clause.
Documentation
For more information and examples, see Microsoft’s documentation for the APPROX_PERCENTILE_DISC()
function.