6 Ways to Select Duplicate Rows in Oracle

The following examples return duplicate rows from an Oracle Database table.

Sample Data

Suppose we have a table with the following data:

SELECT * FROM Pets;

Result:

PetId  PetName  PetType
-----  -------  -------
1      Wag      Dog    
1      Wag      Dog    
2      Scratch  Cat    
3      Tweet    Bird   
4      Bark     Dog    
4      Bark     Dog    
4      Bark     Dog    

The first two rows are duplicates, as are the last three rows. In this case the duplicate rows contain duplicate values across all columns, including the ID column.

Option 1

We can use the following query to see how many rows are duplicates:

SELECT 
    PetId,
    PetName,
    PetType,
    COUNT(*) AS "Count"
FROM Pets
GROUP BY 
    PetId,
    PetName,
    PetType
ORDER BY PetId;

Result:

PETID	PETNAME	PETTYPE	Count
1	Wag	Dog	2
2	Scratch	Cat	1
3	Tweet	Bird	1
4	Bark	Dog	3

We grouped the rows by all columns, and returned the row count of each group. Any row with a count greater than 1 is a duplicate.

We can order it by count in descending order, so that the rows with the most duplicates appear first:

SELECT 
    PetId,
    PetName,
    PetType,
    COUNT(*) AS "Count"
FROM Pets
GROUP BY 
    PetId,
    PetName,
    PetType
ORDER BY Count(*) DESC;

Result:

PETID	PETNAME	PETTYPE	Count
4	Bark	Dog	3
1	Wag	Dog	2
2	Scratch	Cat	1
3	Tweet	Bird	1

Option 2

If we only want the duplicate rows listed, we can use the the HAVING clause to return only rows with a count of greater than 1:

SELECT 
    PetId,
    PetName,
    PetType,
    COUNT(*) AS "Count"
FROM Pets
GROUP BY 
    PetId,
    PetName,
    PetType
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

Result:

PETID	PETNAME	PETTYPE	Count
4	Bark	Dog	3
1	Wag	Dog	2

Option 3

Another option is to use the ROW_NUMBER() window function:

SELECT 
    PetId,
    PetName,
    PetType,
    ROW_NUMBER() OVER ( 
        PARTITION BY PetId, PetName, PetType 
        ORDER BY PetId, PetName, PetType
        ) AS rn
FROM Pets;

Result:

PETID	PETNAME	PETTYPE	RN
1	Wag	Dog	1
1	Wag	Dog	2
2	Scratch	Cat	1
3	Tweet	Bird	1
4	Bark	Dog	1
4	Bark	Dog	2
4	Bark	Dog	3

The PARTITION BY clause divides the result set produced by the FROM clause into partitions to which the function is applied. When we specify partitions for the result set, each partition causes the numbering to start over again (i.e. the numbering will start at 1 for the first row in each partition).

Option 4

We can use the above query as a common table expression:

WITH cte AS 
    (
        SELECT 
            PetId,
            PetName,
            PetType,
            ROW_NUMBER() OVER ( 
                PARTITION BY PetId, PetName, PetType 
                ORDER BY PetId, PetName, PetType
                ) AS Row_Number
        FROM Pets
    )
SELECT * FROM cte WHERE Row_Number <> 1;

Result:

PETID	PETNAME	PETTYPE	ROW_NUMBER
1	Wag	Dog	2
4	Bark	Dog	2
4	Bark	Dog	3

This returns just the excess rows from the matching duplicates. So if there are two identical rows, it returns one of them. If there are three identical rows, it returns two, and so on.

Option 5

Given our table doesn’t contain a primary key column we can take advantage of Oracle’s rowid pseudocolumn:

SELECT * FROM Pets
WHERE EXISTS (
  SELECT 1 FROM Pets p2 
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
  AND Pets.rowid > p2.rowid
);

Result:

PETID	PETNAME	PETTYPE
1	Wag	Dog
4	Bark	Dog
4	Bark	Dog

The way this works is, each row in an Oracle database has a rowid pseudocolumn that returns the address of the row. The rowid is a unique identifier for rows in the table, and usually its value uniquely identifies a row in the database. However, it’s important to note that rows in different tables that are stored together in the same cluster can have the same rowid.

One benefit of the above example is that we can replace SELECT * with DELETE in order to de-dupe the table.

Option 6

And finally, here’a another option that uses the rowid pseudocolumn:

SELECT * FROM Pets
WHERE rowid > (
  SELECT MIN(rowid) FROM Pets p2  
  WHERE Pets.PetName = p2.PetName
  AND Pets.PetType = p2.PetType
);

Result:

PETID	PETNAME	PETTYPE
1	Wag	Dog
4	Bark	Dog
4	Bark	Dog

Same result as the previous example.

As with the previous example, we can replace SELECT * with DELETE in order to remove duplicate rows from the table.