2 Ways to Delete Duplicate Rows in MariaDB (Ignores Primary Key)

Here are examples of removing duplicate rows from a table in MariaDB when those rows have a primary key or unique identifier column.

The examples delete duplicate rows but keep one. So in the case of two identical rows, it deletes one of them and keeps the other.

Sample Data

Our examples use the following data:

SELECT * FROM Dogs;

Result:

+-------+-----------+----------+
| DogId | FirstName | LastName |
+-------+-----------+----------+
|     1 | Bark      | Smith    |
|     2 | Bark      | Smith    |
|     3 | Woof      | Jones    |
|     4 | Ruff      | Robinson |
|     5 | Wag       | Johnson  |
|     6 | Wag       | Johnson  |
|     7 | Wag       | Johnson  |
+-------+-----------+----------+

We can see that the first two rows are duplicates, as well as the last three rows.

The DogId column holds unique values (because it’s the table’s primary key), and therefore, strictly speaking, there are no duplicates. But in real life situations, you will often want to de-dupe tables that contain primary keys. Therefore, in this article we ignore the primary key, and we detect duplicate values across the remaining columns.

Option 1

Let’s start our first option by selecting all rows that will be deleted:

SELECT * FROM Dogs 
WHERE DogId IN (
    SELECT DogId FROM Dogs 
    EXCEPT SELECT MIN(DogId) FROM Dogs 
    GROUP BY FirstName, LastName
    );

Result:

+-------+-----------+----------+
| DogId | FirstName | LastName |
+-------+-----------+----------+
|     2 | Bark      | Smith    |
|     6 | Wag       | Johnson  |
|     7 | Wag       | Johnson  |
+-------+-----------+----------+

To remove those duplicate rows, we can switch the SELECT * to DELETE:

DELETE FROM Dogs 
WHERE DogId IN (
    SELECT DogId FROM Dogs 
    EXCEPT SELECT MIN(DogId) FROM Dogs 
    GROUP BY FirstName, LastName
    );

Result:

Query OK, 3 rows affected (0.017 sec)

And to verify the result, we can select all remaining rows in the table:

SELECT * FROM Dogs;

Result:

+-------+-----------+----------+
| DogId | FirstName | LastName |
+-------+-----------+----------+
|     1 | Bark      | Smith    |
|     3 | Woof      | Jones    |
|     4 | Ruff      | Robinson |
|     5 | Wag       | Johnson  |
+-------+-----------+----------+

We can alternatively use the MAX() function instead of the MIN() function to change which rows are deleted.

Option 2

In this example we’ll assume that the table has been restored to its original state (with the duplicates).

We can use the following query to check for duplicate rows:

SELECT * 
FROM Dogs d1, Dogs d2 
WHERE d1.FirstName = d2.FirstName 
AND d1.LastName = d2.LastName
AND d1.DogId <> d2.DogId 
AND d1.DogId = (
    SELECT MAX(DogId) 
    FROM Dogs d3 
    WHERE d3.FirstName = d1.FirstName 
    AND d3.LastName = d1.LastName
);

Result:

+-------+-----------+----------+-------+-----------+----------+
| DogId | FirstName | LastName | DogId | FirstName | LastName |
+-------+-----------+----------+-------+-----------+----------+
|     2 | Bark      | Smith    |     1 | Bark      | Smith    |
|     7 | Wag       | Johnson  |     5 | Wag       | Johnson  |
|     7 | Wag       | Johnson  |     6 | Wag       | Johnson  |
+-------+-----------+----------+-------+-----------+----------+

And we can modify that query to delete the duplicates:

DELETE FROM Dogs WHERE DogId IN (
    SELECT d2.DogId 
    FROM Dogs d1, Dogs d2 
    WHERE d1.FirstName = d2.FirstName 
    AND d1.LastName = d2.LastName 
    AND d1.DogId <> d2.DogId 
    AND d1.DogId=( 
        SELECT MIN(DogId) 
        FROM Dogs d3 
        WHERE d3.FirstName = d1.FirstName 
        AND d3.LastName = d1.LastName
    )
);

Result:

Query OK, 3 rows affected (0.075 sec)

The table has now been de-duped.

We can verify this by selecting all rows again:

SELECT * FROM Dogs;

Result:

+-------+-----------+----------+
| DogId | FirstName | LastName |
+-------+-----------+----------+
|     1 | Bark      | Smith    |
|     3 | Woof      | Jones    |
|     4 | Ruff      | Robinson |
|     5 | Wag       | Johnson  |
+-------+-----------+----------+

We can use MAX() instead of MIN() to delete the other rows from the duplicates if preferred.