The following example deletes duplicate rows in MySQL while ignoring the primary key or unique identifier column.
The example deletes duplicate rows but keeps one. So in the case of two identical rows, it deletes one of them and keeps the other.
Sample Data
Suppose we have a table with the following data:
SELECT * FROM Dogs;
Result:
+---------+-------------+------------+ | DogId | FirstName | LastName | |---------+-------------+------------| | 1 | Bark | Smith | | 2 | Bark | Smith | | 3 | Woof | Jones | | 4 | Ruff | Robinson | | 5 | Wag | Johnson | | 6 | Wag | Johnson | | 7 | Wag | Johnson | +---------+-------------+------------+
We can see that the first two rows are duplicates, and the last three rows are duplicates.
Find the Duplicates
First, let’s select check our table to see how many rows are duplicates:
SELECT
FirstName,
LastName,
COUNT(*) AS Count
FROM Dogs
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1;
Result:
+-----------+----------+-------+ | FirstName | LastName | Count | +-----------+----------+-------+ | Bark | Smith | 2 | | Wag | Johnson | 3 | +-----------+----------+-------+
We can see that there are two rows with Bark Smith, and three rows with Wag Johnson.
We will de-dupe the table so that it contains just one of each.
Delete the Duplicates
Running the following code de-dupes the above table:
DELETE d1 FROM Dogs d1
INNER JOIN Dogs d2
WHERE
d1.DogId < d2.DogId AND
d1.FirstName = d2.FirstName AND
d1.LastName = d2.LastName;
Result:
Query OK, 3 rows affected (0.00 sec)
Let’s view the result:
SELECT * FROM Dogs;
Result:
+-------+-----------+----------+ | DogId | FirstName | LastName | +-------+-----------+----------+ | 2 | Bark | Smith | | 3 | Woof | Jones | | 4 | Ruff | Robinson | | 7 | Wag | Johnson | +-------+-----------+----------+
We have successfully removed duplicate rows from the table.