Understanding the PERCENTILE_CONT() Function in SQL Server

In SQL Server, PERCENTILE_CONT() is a window function that calculates a percentile based on a continuous distribution of the column value.

When we call the function, we specify the percentile to use. It then performs its calculation based on that percentile.

Syntax

The syntax goes like this:

PERCENTILE_CONT ( numeric_literal )   
    WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] )  
    OVER ( [ <partition_by_clause> ] )

So numeric_literal is in the range between 0.0 and 1.0. This (required) value is the percentile to compute.

We must provide a WITHIN GROUP clause with an ORDER BY clause.

We must also provide an OVER clause.

Example

Here’s a basic example to demonstrate:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.5 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice
        )
        OVER ( 
            PARTITION BY VendorId
        ) AS PERCENTILE_CONT
FROM Products;

Result:

VendorId  ProductName                      ProductPrice  PERCENTILE_CONT   
--------  -------------------------------  ------------  ------------------
1001      Long Weight (green)              11.99         20.369999999999997
1001      Long Weight (blue)               14.75         20.369999999999997
1001      Left handed screwdriver          25.99         20.369999999999997
1001      Right handed screwdriver         25.99         20.369999999999997
1002      Sledge Hammer                    33.49         33.49             
1003      Hammock                          10            55.99             
1003      Straw Dog Box                    55.99         55.99             
1003      Chainsaw                         245           55.99             
1004      Bottomless Coffee Mugs (4 Pack)  9.99          11.219999999999999
1004      Tea Pot                          12.45         11.219999999999999

In this example, we used an argument of 0.5 in order to calculate the median price of all products from each vendor. We used the WITHIN GROUP clause to sort the set by product price, and we used the OVER clause with a PARTITION BY clause to partition the results by vendor.

The PERCENTILE_CONT() function doesn’t necessarily return a value from the actual data set. We can see an example of this with the first vendor. In that case, the PERCENTILE_CONT column contains 20.369999999999997, but that value isn’t actually in the data set. Same with the last vendor. Its PERCENTILE_CONT column contains 11.219999999999999, but that value isn’t in the underlying data.

SQL Server also has a PERCENTILE_DISC() function that works in a similar way to PERCENTILE_CONT(), except that it always returns an actual value from the data set.

Another Example

Here’s an example that uses a different percentile:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.8 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice
        )
        OVER ( 
            PARTITION BY VendorId
        ) AS PERCENTILE_CONT
FROM Products;

Result:

VendorId  ProductName                      ProductPrice  PERCENTILE_CONT   
--------  -------------------------------  ------------  ------------------
1001      Long Weight (green)              11.99         25.99             
1001      Long Weight (blue)               14.75         25.99             
1001      Left handed screwdriver          25.99         25.99             
1001      Right handed screwdriver         25.99         25.99             
1002      Sledge Hammer                    33.49         33.49             
1003      Hammock                          10            169.39600000000002
1003      Straw Dog Box                    55.99         169.39600000000002
1003      Chainsaw                         245           169.39600000000002
1004      Bottomless Coffee Mugs (4 Pack)  9.99          11.958            
1004      Tea Pot                          12.45         11.958            

The only difference between this example and the previous one is that I used a percentile of 0.8 instead of 0.5.

We can see how this has drastically changed the output for vendor 1003, but not so much for the others, if at all.

The OVER Clause is Required

As mentioned, we must provide the OVER clause when using PERCENTILE_CONT(). Here’s what happens if we remove it:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.8 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice
        ) AS PERCENTILE_CONT
FROM Products;

Result:

Msg 10753, Level 15, State 3, Line 5
The function 'PERCENTILE_CONT' must have an OVER clause.

About the ORDER BY Clause

The ORDER BY clause cannot be in the OVER clause when using the PERCENTILE_CONT() function. This makes it (and the PERCENTILE_DISC() function) a bit different to most other window functions, where we must have an ORDER BY clause in the OVER clause.

That said, we can see from the above example that the ORDER BY clause goes into the WITHIN GROUP clause when using PERCENTILE_CONT().

Here’s what happens if we put an ORDER BY clause in the OVER clause:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.8 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice
        )
        OVER ( 
            PARTITION BY VendorId
            ORDER BY ProductPrice
        ) AS PERCENTILE_CONT
FROM Products;

Result:

Msg 10758, Level 15, State 1, Line 5
The function 'PERCENTILE_CONT' may not have ORDER BY in OVER clause.

Another thing about the ORDER BY clause when using PERCENTILE_CONT() is that only one ORDER BY expression is allowed.

Here’s what happens when we provide more than one ORDER BY expression:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.5 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice, ProductName
        )
        OVER ( 
            PARTITION BY VendorId
        ) AS PERCENTILE_CONT
FROM Products;

Result:

Msg 10751, Level 15, State 1, Line 5
The ORDER BY in WITHIN GROUP clause of 'PERCENTILE_CONT' function must have exactly one expression.

The error tells us that the function must only have one expression.

No Window Frame Allowed

Another thing we can’t do when using the PERCENTILE_CONT() function is specify a window frame. By this I mean we can’t use the ROWS or RANGE clause like we can in other window functions.

Here’s what happens when we try to specify a window frame:

SELECT
    VendorId,
    ProductName,
    ProductPrice,
    PERCENTILE_CONT( 0.8 ) 
        WITHIN GROUP (
            ORDER BY ProductPrice
        )
        OVER ( 
            PARTITION BY VendorId
            RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS PERCENTILE_CONT
FROM Products;

Result:

Msg 10752, Level 15, State 1, Line 5
The function 'PERCENTILE_CONT' may not have a window frame.

Documentation

For more information and examples, see Microsoft’s documentation for the PERCENTILE_CONT() function.