6 Ways to Delete Duplicate Rows that have a Primary Key in Oracle

Here are some options for deleting duplicate rows from a table in Oracle Database when those rows have a primary key or unique identifier column.

In such cases, the primary key must be ignored when comparing duplicate rows (due to the fact that primary keys hold unique values).

Sample Data

Our examples use the following data:

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
2BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson
6WagJohnson
7WagJohnson

We can see that the first two rows are duplicates, as are the last three rows.

The DogId column holds unique values (because it’s the table’s primary key), but we’re ignoring that column when comparing duplicates. You may often find yourself needing to de-dupe tables that contain primary keys, and so the following examples can be used to do just that.

Option 1

Here’s our first option for de-duping the above table:

DELETE FROM Dogs 
WHERE DogId IN (
    SELECT DogId FROM Dogs 
    MINUS SELECT MIN(DogId) FROM Dogs 
    GROUP BY FirstName, LastName
    );

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson

The duplicates have been removed (but one row of each duplicate remains).

We can alternatively use the MAX() function instead of the MIN() function to change which rows are deleted.

Option 2

In this example (and the following examples) we’ll assume that the table has been restored to its original state (with the duplicates).

Here’s another example that de-dupes the table and then selects the remaining rows:

DELETE FROM Dogs WHERE DogId IN (
    SELECT d2.DogId 
    FROM Dogs d1, Dogs d2 
    WHERE d1.FirstName = d2.FirstName 
    AND d1.LastName = d2.LastName 
    AND d1.DogId <> d2.DogId 
    AND d1.DogId=( 
        SELECT MAX(DogId) 
        FROM Dogs d3 
        WHERE d3.FirstName = d1.FirstName 
        AND d3.LastName = d1.LastName
    )
);

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
2BarkSmith
3WoofJones
4RuffRobinson
7WagJohnson

Notice that I used the MAX() function instead of MIN() which I used in the previous example. We can see the affect this has on the de-duping operation. It deleted different rows from the table.

Option 3

Here’s an option that doesn’t require the use of MIN() or MAX():

DELETE FROM Dogs
WHERE EXISTS (
  SELECT 1 FROM Dogs d2 
  WHERE Dogs.FirstName = d2.FirstName
  AND Dogs.LastName = d2.LastName
  AND Dogs.DogId > d2.DogId
);

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson

Option 4

Here’s another option:

DELETE FROM Dogs
WHERE DogId > (
  SELECT MIN(DogId) FROM Dogs d2  
  WHERE Dogs.FirstName = d2.FirstName
  AND Dogs.LastName = d2.LastName
);

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson

Option 5

Each row in Oracle has a rowid pseudocolumn that returns the address of the row. The rowid is a unique identifier for rows in the table, and usually its value uniquely identifies a row in the database (although it’s important to note that rows in different tables that are stored together in the same cluster can have the same rowid).

We can therefore use the rowid in our query instead of the DogId column:

DELETE FROM Dogs
WHERE EXISTS (
  SELECT 1 FROM Dogs d2 
  WHERE Dogs.FirstName = d2.FirstName
  AND Dogs.LastName = d2.LastName
  AND Dogs.rowid > d2.rowid
);

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson

While this example may seem slightly superfluous, given we already have a primary key column, there may be instances where you prefer to use the rowid. The rowid can be useful if you can’t use the primary key column for some reason, or if the table doesn’t have a primary key. Also, Oracle’s documentation mentions that rowid values are the fastest way to access a single row.

Option 6

And here’s the other example, but with rowid instead of the primary key:

DELETE FROM Dogs
WHERE rowid > (
  SELECT MIN(rowid) FROM Dogs d2  
  WHERE Dogs.FirstName = d2.FirstName
  AND Dogs.LastName = d2.LastName
);

SELECT * FROM Dogs;

Result:

DOGIDFIRSTNAMELASTNAME
1BarkSmith
3WoofJones
4RuffRobinson
5WagJohnson