Remove Duplicates from Count() Results in SQLite

When using the count() function in SQLite, you might find yourself in the situation where you only want to count distinct values. That is, you don’t want duplicate values to be counted multiple times.

In other words, if the column has the same value multiple times, it should only count that value once.

By default, count() will include all duplicate values in its calculation. For example, if the value “Cat” appears 3 times, count() will count that as 3.

However, if you only want distinct values to be counted, then count() would count it as 1.

Fortunately, there’s an easy way to do this. All you need to do is add the DISTINCT keyword to your count() function. Like this:

count(DISTINCT x)

Where x is the column name for which contents you’re counting (or the whole row if you’re using the asterisk wildcard).

Example

Take the following table:

ProductId   ProductName    Price     
----------  -------------  ----------
1           Widget Holder  139.5     
2           Blue Widget    10.0      
3           Red Widget     10.0      
4           Green Widget   10.0      
5           Widget Stick   89.75     
6           Foo Cap        11.99     

Notice that rows 2 to 4 all have the same price (10.0).

If I do a normal count() on the Price column, it will count all six rows.

SELECT count(Price) FROM Products;

Result:

6

However, if I add the DISTINCT keyword, it will count those three rows as one.

SELECT count(DISTINCT Price) FROM Products;

Result:

4

Count Distinct Rows

The count() function accepts the asterisk wildcard (*), which means that it will count all rows.

However, you’ll probably get an error if you try to use DISTINCT when using the asterisk wildcard.

Here’s a table with duplicate rows.

Postal      State             
----------  ------------------
NSW         New South Wales   
NSW         New South Wales   
QLD         Queensland        
TAS         Tasmania          
SA          South Australia   
WA          Western Australia 
VIC         Victoria          

In theory, I should be able to use DISTINCT to count the “de-duped” rows in this table. However, that doesn’t seem possible.

SELECT count(DISTINCT *)
FROM States;

Result:

Error: near "*": syntax error

Instead, I need to specify a column name when using the DISTINCT keyword.

SELECT 
  count(State),
  count(DISTINCT State)
FROM States;

Result:

count(State)  count(DISTINCT State)
------------  ---------------------
7             6                    

Another way to do it would be to do something like this:

SELECT count(*) FROM (SELECT DISTINCT * FROM States);

Result:

6

This shouldn’t really be a problem though, as it’s generally not good database design to allow duplicate rows like this.