APPROX_COUNT_DISTINCT()
is one of the new functions introduced in SQL Server 2019. This function returns the approximate number of unique non-null values in a group.
Basically, you can use it to get an approximate idea of the number of non-duplicate rows in a large table or result set. It works similar to the COUNT_BIG()
and COUNT()
functions (when using the DISTINCT
clause), but it returns an approximate number rather than a precise number.
APPROX_COUNT_DISTINCT()
is aimed mainly at big data scenarios. It’s designed for accessing large data sets with more than a million rows, and aggregation of a column or columns that have many distinct values. It is intended for scenarios where responsiveness is more critical than absolute precision.
Microsoft states that the function implementation guarantees up to a 2% error rate within a 97% probability.
At the time of writing, APPROX_COUNT_DISTINCT()
is a public preview feature. It was introduced in SQL Server 2019, which is also currently in preview status.
Note that Microsoft states that preview features are not intended for production use.