When using the count()
function in SQLite, you might find yourself in the situation where you only want to count distinct values. That is, you don’t want duplicate values to be counted multiple times.
In other words, if the column has the same value multiple times, it should only count that value once.
By default, count()
will include all duplicate values in its calculation. For example, if the value “Cat” appears 3 times, count()
will count that as 3.
However, if you only want distinct values to be counted, then count()
would count it as 1.
Fortunately, there’s an easy way to do this. All you need to do is add the DISTINCT
keyword to your count()
function. Like this:
count(DISTINCT x)
Where x
is the column name for which contents you’re counting (or the whole row if you’re using the asterisk wildcard).
Example
Take the following table:
ProductId ProductName Price ---------- ------------- ---------- 1 Widget Holder 139.5 2 Blue Widget 10.0 3 Red Widget 10.0 4 Green Widget 10.0 5 Widget Stick 89.75 6 Foo Cap 11.99
Notice that rows 2 to 4 all have the same price (10.0).
If I do a normal count()
on the Price column, it will count all six rows.
SELECT count(Price) FROM Products;
Result:
6
However, if I add the DISTINCT
keyword, it will count those three rows as one.
SELECT count(DISTINCT Price) FROM Products;
Result:
4
Count Distinct Rows
The count()
function accepts the asterisk wildcard (*
), which means that it will count all rows.
However, you’ll probably get an error if you try to use DISTINCT
when using the asterisk wildcard.
Here’s a table with duplicate rows.
Postal State ---------- ------------------ NSW New South Wales NSW New South Wales QLD Queensland TAS Tasmania SA South Australia WA Western Australia VIC Victoria
In theory, I should be able to use DISTINCT
to count the “de-duped” rows in this table. However, that doesn’t seem possible.
SELECT count(DISTINCT *)
FROM States;
Result:
Error: near "*": syntax error
Instead, I need to specify a column name when using the DISTINCT
keyword.
SELECT
count(State),
count(DISTINCT State)
FROM States;
Result:
count(State) count(DISTINCT State) ------------ --------------------- 7 6
Another way to do it would be to do something like this:
SELECT count(*) FROM (SELECT DISTINCT * FROM States);
Result:
6
This shouldn’t really be a problem though, as it’s generally not good database design to allow duplicate rows like this.