How to Delete Duplicate Rows that have a Primary Key in SQL

The following examples use SQL to delete duplicate rows that have a primary key or unique identifier column.

Specifically, the examples delete all duplicates except one. So for example, in the case of three identical rows, it deletes two of them and keeps one.

Sample Data

Suppose we have a table with the following data:

SELECT * FROM Dogs;

Result:

+---------+-------------+------------+
| DogId   | FirstName   | LastName   |
|---------+-------------+------------|
| 1       | Bark        | Smith      |
| 2       | Bark        | Smith      |
| 3       | Woof        | Jones      |
| 4       | Ruff        | Robinson   |
| 5       | Wag         | Johnson    |
| 6       | Wag         | Johnson    |
| 7       | Wag         | Johnson    |
+---------+-------------+------------+

The first two rows are duplicates and the last three rows are duplicates (not counting the first column – the primary key).

MariaDB, PostgreSQL, SQLite, & SQL Server

Here’s an example of deleting duplicates from the above table.

This example works in SQL Server, MariaDB, PostgreSQL, and SQLite.

DELETE FROM Dogs 
WHERE DogId IN (
    SELECT DogId FROM Dogs 
    EXCEPT SELECT MIN(DogId) FROM Dogs 
    GROUP BY FirstName, LastName
    );

SELECT * FROM Dogs;

Result:

+---------+-------------+------------+
| DogId   | FirstName   | LastName   |
|---------+-------------+------------|
| 1       | Bark        | Smith      |
| 3       | Woof        | Jones      |
| 4       | Ruff        | Robinson   |
| 5       | Wag         | Johnson    |
+---------+-------------+------------+

Oracle

The previous example almost works in Oracle. But in Oracle, we need to replace EXCEPT with MINUS:

DELETE FROM Dogs 
WHERE DogId IN (
    SELECT DogId FROM Dogs 
    MINUS SELECT MIN(DogId) FROM Dogs 
    GROUP BY FirstName, LastName
    );

SELECT * FROM Dogs;

Result:

+---------+-------------+------------+
| DogId   | FirstName   | LastName   |
|---------+-------------+------------|
| 1       | Bark        | Smith      |
| 3       | Woof        | Jones      |
| 4       | Ruff        | Robinson   |
| 5       | Wag         | Johnson    |
+---------+-------------+------------+

MySQL

MySQL doesn’t support the EXCEPT operator or the MINUS operator. Therefore we need to construct a different query altogether.

The following query gets the job done:

DELETE d1 FROM Dogs d1
INNER JOIN Dogs d2
WHERE
    d1.DogId > d2.DogId AND
    d1.FirstName = d2.FirstName AND
    d1.LastName = d2.LastName;

SELECT * FROM Dogs;

Result:

+---------+-------------+------------+
| DogId   | FirstName   | LastName   |
|---------+-------------+------------|
| 1       | Bark        | Smith      |
| 3       | Woof        | Jones      |
| 4       | Ruff        | Robinson   |
| 5       | Wag         | Johnson    |
+---------+-------------+------------+

Using a CTE to Delete Duplicates

See Using a CTE with a DELETE Statement in SQL Server to Remove Duplicates for a step-by-step guide to deleting duplicate rows using a CTE. This includes viewing the data to be deleted before deleting it.

Logging Deleted Rows

Some RDBMSs provide us with a handy clause that allows us to see or log all deleted rows. For example:

  • PostgreSQL has the RETURNING clause that returns a list of the rows that were deleted.
  • SQL Server has the OUTPUT clause that does the same thing, but also allows us to insert the deleted rows into another table or table variable.