2 Reasons Your MySQL REPLACE Statement Results in Duplicate Rows

If you’re using the REPLACE statement to replace rows in MySQL, but it’s inserting duplicate rows instead, it could be due to one of the following reasons.

Omitting the PRIMARY KEY or UNIQUE Index Column

It could be that you’re not including the PRIMARY KEY or UNIQUE index column in the values to replace. In this case, the REPLACE statement will insert a new row instead of replacing an existing one. That’s because it has no PRIMARY KEY or UNIQUE index for which to determine which row to replace.

Suppose we create a table and insert data like this:

CREATE TABLE Person
(
    PersonId INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    FirstName VARCHAR(20),
    LastName VARCHAR(20)
);

INSERT INTO Person( FirstName, LastName ) VALUES( 'Homer', 'Simpson' );
INSERT INTO Person( FirstName, LastName ) VALUES( 'Lisa', 'Simpson' );
INSERT INTO Person( FirstName, LastName ) VALUES( 'Bart', 'Simpson' );

SELECT * FROM Person;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Bart      | Simpson  |
+----------+-----------+----------+
3 rows in set (0.00 sec)

The PersonId column is the primary key column, and it’s also an AUTO_INCREMENT column, which means that if we don’t explicitly specify a value for this column it will automatically insert a value that automatically increments with each row.

Let’s do that with the REPLACE statement. Let’s say we want to replace Bart Simpson with Sideshow Bob but we forget to specify a value for the primary key column:

REPLACE INTO Person( FirstName, LastName ) VALUES( 'Sideshow', 'Bob' );
SELECT * FROM Person;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Bart      | Simpson  |
|        4 | Sideshow  | Bob      |
+----------+-----------+----------+
4 rows in set (0.00 sec)

The row was inserted as a whole new row instead of updating any existing rows.

To update Bart Simpson, we need to specify which PersonId to update (because that’s the primary key column).

Let’s do that now:

REPLACE INTO Person( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Sideshow  | Bob      |
|        4 | Sideshow  | Bob      |
+----------+-----------+----------+
4 rows in set (0.00 sec)

The correct row has been replaced. The fourth row still exists, because we didn’t delete it, but it could easily be delete or replaced with new values.

Not Having a PRIMARY KEY or UNIQUE Index Column

Another reason for REPLACE inserting duplicate rows is that if the table doesn’t have a PRIMARY KEY or UNIQUE index column. In this case we can’t replace anything, because there’s no unique identifier to tell MySQL which row to replace.

Suppose we create the following table (without a PRIMARY KEY or UNIQUE index):

CREATE TABLE Person2
(
    PersonId INT,
    FirstName VARCHAR(20),
    LastName VARCHAR(20)
);

INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 1, 'Homer', 'Simpson' );
INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 2, 'Lisa', 'Simpson' );
INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Bart', 'Simpson' );

SELECT * FROM Person2;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Bart      | Simpson  |
+----------+-----------+----------+
3 rows in set (0.00 sec)

Now let’s try to replace Bart Simpson with Sideshow Bob:

REPLACE INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person2;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Bart      | Simpson  |
|        3 | Sideshow  | Bob      |
+----------+-----------+----------+
4 rows in set (0.00 sec)

This resulted in a new row being inserted, instead of the existing row being replaced. Bart wasn’t replaced at all. And the PersonId column now has duplicate values.

That’s because the table doesn’t have a PRIMARY KEY or UNIQUE index to tell MySQL which row to replace. The REPLACE statement inserts a new row in such cases.

If we run the statement again, we’ll get a truely duplicate row – including the PersonId column:

REPLACE INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person2;

Result:

+----------+-----------+----------+
| PersonId | FirstName | LastName |
+----------+-----------+----------+
|        1 | Homer     | Simpson  |
|        2 | Lisa      | Simpson  |
|        3 | Bart      | Simpson  |
|        3 | Sideshow  | Bob      |
|        3 | Sideshow  | Bob      |
+----------+-----------+----------+
5 rows in set (0.00 sec)

Therefore to resolve this issue, we would need to make sure we’re dealing with a table that has a PRIMARY KEY or UNIQUE index.