The following example uses T-SQL to delete duplicate rows in SQL Server.
To be more specific, it deletes duplicate rows but keeps one. So if you have two identical rows, it deletes one of them and keeps the other. In other words, it de-dupes the table.
Sample Data
Suppose we have a table with the following data:
SELECT * FROM Pets;
Result:
+---------+-----------+-----------+ | PetId | PetName | PetType | |---------+-----------+-----------| | 1 | Wag | Dog | | 1 | Wag | Dog | | 2 | Scratch | Cat | | 3 | Tweet | Bird | | 4 | Bark | Dog | | 4 | Bark | Dog | | 4 | Bark | Dog | +---------+-----------+-----------+
We can see that the first two rows are duplicates, as are the last three rows.
Select Duplicates
Before we de-dupe the table, we can use the following query to see which row/s will be deleted:
WITH CTE AS
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY PetId, PetName, PetType
ORDER BY PetId, PetName, PetType
) AS Row_Number
FROM Pets
)
SELECT * FROM CTE WHERE Row_Number <> 1;
Result:
+---------+-----------+-----------+--------------+ | PetId | PetName | PetType | Row_Number | |---------+-----------+-----------+--------------| | 1 | Wag | Dog | 2 | | 4 | Bark | Dog | 2 | | 4 | Bark | Dog | 3 | +---------+-----------+-----------+--------------+
Delete Duplicates
To delete duplicate values, we can modify the above query by replacing SELECT *
on the last line with DELETE
:
WITH CTE AS
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY PetId, PetName, PetType
ORDER BY PetId, PetName, PetType
) AS Row_Number
FROM Pets
)
DELETE FROM CTE WHERE Row_Number <> 1;
Result:
(3 rows affected)
The table has now been de-duped.
We can verify this by selecting all rows again:
SELECT * FROM Pets;
Result:
+---------+-----------+-----------+ | PetId | PetName | PetType | |---------+-----------+-----------| | 1 | Wag | Dog | | 2 | Scratch | Cat | | 3 | Tweet | Bird | | 4 | Bark | Dog | +---------+-----------+-----------+
As expected, one of the duplicate rows for our dog “Wag” has been deleted and the other one remains. Two of the duplicate rows for “Bark” have also been deleted. The table has been successfully de-duped.