Tweak your Avg() Results in SQLite with the DISTINCT Keyword

If you know about the avg() function in SQLite, you’re probably aware that it returns the average of all non-NULL X within a group.

But did you know you can add the DISTINCT keyword to this function?

If you add the DISTINCT keyword, avg() will calculate its results based on distinct values only. This is essentially the same as removing duplicate values and then calculating the average on the remaining values.

Syntax

To use the DISTINCT keyword, you simply insert it as the first argument.

Like this:

avg(DISTINCT X)

Where X is the column name for which you’re calculating the average.

Example

Take the following table called Products:

ProductId   ProductName    Price     
----------  -------------  ----------
1           Widget Holder  139.5     
2           Blue Widget    10.0      
3           Red Widget     10.0      
4           Green Widget   10.0      
5           Widget Stick   89.75     
6           Foo Cap        11.99

If I run a normal avg() on the Price column:

SELECT avg(Price) FROM Products;

Here’s what I get:

45.2066666666667

But if I run a DISTINCT query:

SELECT avg(DISTINCT Price) FROM Products;

I get this:

62.81

So in this case it changes the result quite considerably.

Just to be clear, here they are side by side:

SELECT 
 avg(Price) AS "Non-Distinct",
 avg(DISTINCT Price) AS "Distinct"
FROM Products;

Result:

Non-Distinct      Distinct  
----------------  ----------
45.2066666666667  62.81

As you can probably imagine, using the DISTINCT keyword with avg() could skew the results enormously, especially if there are a lot of duplicates at one end of the range, but not many duplicates at the other end.