An Introduction to the APPROX_PERCENTILE_DISC() Function in SQL Server

In SQL Server, the APPROX_PERCENTILE_DISC() function calculates and returns an approximate percentile based on a discrete distribution of the column values.

We pass the desired percentile to the function when we call it.

The APPROX_PERCENTILE_DISC() function can be used in place of the PERCENTILE_DISC() function if you just want to quickly get an approximate value. This could be useful on larger data sets, where using PERCENTILE_DISC() to get an accurate percentile value could take a much longer time to execute.

The APPROX_PERCENTILE_DISC() function was introduced in SQL Server 2022.

Syntax

The syntax goes like this:

APPROX_PERCENTILE_DISC (numeric_literal)  
WITHIN GROUP (ORDER BY order_by_expression [ASC|DESC])

The numeric_literal argument is in the range between 0.0 and 1.0. This (required) value is the percentile to compute.

We must provide a WITHIN GROUP clause with an ORDER BY clause. Only one ORDER BY expression is allowed.

Example

Suppose we have the following table:

SELECT * FROM Products;

Result:

ProductId   VendorId    ProductName                              ProductPrice         
----------- ----------- ---------------------------------------- ---------------------
          1        1001 Left handed screwdriver                                25.9900
          2        1001 Right handed screwdriver                               25.9900
          3        1001 Long Weight (blue)                                     14.7500
          4        1001 Long Weight (green)                                    11.9900
          5        1002 Sledge Hammer                                          33.4900
          6        1003 Chainsaw                                              245.0000
          7        1003 Straw Dog Box                                          55.9900
          8        1003 Hammock                                                10.0000
          9        1004 Bottomless Coffee Mugs (4 Pack)                         9.9900
         10        1004 Tea Pot                                                12.4500

We can use the APPROX_PERCENTILE_DISC() function against that table like this:

SELECT
    APPROX_PERCENTILE_DISC( 0.5 ) 
        WITHIN GROUP ( ORDER BY ProductPrice ) AS APPROX_PERCENTILE_DISC
FROM Products;

Result:

APPROX_PERCENTILE_DISC
----------------------
               14.7500

In this example, we used an argument of 0.5 in order to calculate the median price of all products in the table, sorted by product price (thanks to the WITHIN GROUP clause).

The APPROX_PERCENTILE_DISC() function always returns a value from the underlying data set. This is in contrast to the APPROX_PERCENTILE_CONT() function, which returns an interpolated value (which means that it isn’t necessarily in the actual data set).

Another Example

Here’s an example that uses different percentiles:

SELECT
    APPROX_PERCENTILE_DISC( 0.3 ) 
        WITHIN GROUP ( ORDER BY ProductPrice ) AS Result1,
    APPROX_PERCENTILE_DISC( 0.7 ) 
        WITHIN GROUP ( ORDER BY ProductPrice ) AS Result2,
    APPROX_PERCENTILE_DISC( 0.9 ) 
        WITHIN GROUP ( ORDER BY ProductPrice ) AS Result3
FROM Products;

Result:

Result1               Result2               Result3              
--------------------- --------------------- ---------------------
              11.9900               25.9900               55.9900

As expected, all three values are in the underlying data set.

The WITHIN GROUP Clause is Required

We must provide the WITHIN GROUP clause when using APPROX_PERCENTILE_DISC(). Here’s what happens if we remove it:

SELECT
    APPROX_PERCENTILE_DISC( 0.5 )
FROM Products;

Result:

Msg 10754, Level 15, State 2, Server c2cfd5de5ad7, Line 33
The function 'APPROX_PERCENTILE_DISC' must have a WITHIN GROUP clause.

About the ORDER BY Clause

The WITHIN GROUP clause must have an ORDER BY sub-clause, and that ORDER BY clause must have exactly one expression (no more, no less).

Here’s what happens when we include more than one ORDER BY expression:

SELECT
    APPROX_PERCENTILE_DISC( 0.5 ) 
        WITHIN GROUP ( ORDER BY ProductPrice, ProductId )
FROM Products;

Result:

Msg 10751, Level 15, State 2, Server 1f674fe22ff6, Line 33
The ORDER BY in WITHIN GROUP clause of 'APPROX_PERCENTILE_DISC' function must have exactly one expression.

The error message is self explanatory – we can only have one expression in the ORDER BY clause in the WITHIN GROUP clause.

Here’s what happens if we remove the ORDER BY clause altogether:

SELECT
    APPROX_PERCENTILE_DISC( 0.5 ) 
        WITHIN GROUP ( )
FROM Products;

Result:

Msg 102, Level 15, State 1, Server c2cfd5de5ad7, Line 34
Incorrect syntax near ')'.

It tells us we have a syntax error. This is because the WITHIN GROUP clause doesn’t contain an ORDER BY sub-clause.

Documentation

For more information and examples, see Microsoft’s documentation for the APPROX_PERCENTILE_DISC() function.