2 Ways to Delete Duplicate Rows in SQLite

The following options can be used to delete duplicate rows in SQLite.

These examples delete duplicate rows but keep one. So if there are three identical rows for example, it deletes two of them and keeps one. This is often referred to as de-duping the table.

Sample Data

Suppose we have a table with the following data:

SELECT * FROM Pets;

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
1      Wag      Dog    
2      Scratch  Cat    
3      Tweet    Bird   
4      Bark     Dog    
4      Bark     Dog    
4      Bark     Dog    

We can see that the first two rows are duplicates, as are the last three rows.

In this case, all columns are duplicates. There is no primary key. The PetId column might look like it could be a primary key, but it actually contains duplicate values. Therefore, it’s not a unique identifier for each row, and it cannot be used as a primary key.

If there was a primary key, it would contain unique values across all rows, and there would be no duplicates.

In any case, below are two options for finding and deleting duplicate rows.

Option 1

Before we de-dupe the table, we can use the following query to see which row/s will be deleted:

SELECT * FROM Pets
WHERE EXISTS (
  SELECT 1 FROM Pets p2 
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
  AND Pets.rowid > p2.rowid
);

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
4      Bark     Dog    
4      Bark     Dog    

This shows us that three rows are going to be deleted when we de-dupe the table in the next step.

To delete the duplicate values, we can modify the above query by replacing SELECT * with DELETE:

DELETE FROM Pets
WHERE EXISTS (
  SELECT 1 FROM Pets p2 
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
  AND Pets.rowid > p2.rowid
);

SELECT * FROM Pets;

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
2      Scratch  Cat    
3      Tweet    Bird   
4      Bark     Dog    

The table has now been de-duped.

As expected, one of the duplicate rows for our dog “Wag” has been deleted and the other one remains. Two of the duplicate rows for “Bark” have also been deleted.

The reason we were able to do this is because of SQLite’s rowid. By default, every row in SQLite has a special column, usually called the rowid, that uniquely identifies that row within the table. Unless it has been explicitly removed from the table, you can use this as a unique identifier for each row, which enables us to construct the above queries. The same applies to the next example.

Option 2

Assuming the table has been restored with its original data (including duplicate rows), here’s another option for de-duping it.

Check which rows will be deleted:

SELECT * FROM Pets
WHERE rowid > (
  SELECT MIN(rowid) FROM Pets p2  
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
);

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
4      Bark     Dog    
4      Bark     Dog    

Now delete those rows:

DELETE FROM Pets
WHERE rowid > (
  SELECT MIN(rowid) FROM Pets p2  
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
);

SELECT * FROM Pets;

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
2      Scratch  Cat    
3      Tweet    Bird   
4      Bark     Dog    

The table has been de-duped.