If you’re using the REPLACE
statement to replace rows in MySQL, but it’s inserting duplicate rows instead, it could be due to one of the following reasons.
Omitting the PRIMARY KEY
or UNIQUE
Index Column
It could be that you’re not including the PRIMARY KEY
or UNIQUE
index column in the values to replace. In this case, the REPLACE
statement will insert a new row instead of replacing an existing one. That’s because it has no PRIMARY KEY
or UNIQUE
index for which to determine which row to replace.
Suppose we create a table and insert data like this:
CREATE TABLE Person
(
PersonId INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
FirstName VARCHAR(20),
LastName VARCHAR(20)
);
INSERT INTO Person( FirstName, LastName ) VALUES( 'Homer', 'Simpson' );
INSERT INTO Person( FirstName, LastName ) VALUES( 'Lisa', 'Simpson' );
INSERT INTO Person( FirstName, LastName ) VALUES( 'Bart', 'Simpson' );
SELECT * FROM Person;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Bart | Simpson | +----------+-----------+----------+ 3 rows in set (0.00 sec)
The PersonId
column is the primary key column, and it’s also an AUTO_INCREMENT
column, which means that if we don’t explicitly specify a value for this column it will automatically insert a value that automatically increments with each row.
Let’s do that with the REPLACE
statement. Let’s say we want to replace Bart Simpson
with Sideshow Bob
but we forget to specify a value for the primary key column:
REPLACE INTO Person( FirstName, LastName ) VALUES( 'Sideshow', 'Bob' );
SELECT * FROM Person;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Bart | Simpson | | 4 | Sideshow | Bob | +----------+-----------+----------+ 4 rows in set (0.00 sec)
The row was inserted as a whole new row instead of updating any existing rows.
To update Bart Simpson, we need to specify which PersonId
to update (because that’s the primary key column).
Let’s do that now:
REPLACE INTO Person( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Sideshow | Bob | | 4 | Sideshow | Bob | +----------+-----------+----------+ 4 rows in set (0.00 sec)
The correct row has been replaced. The fourth row still exists, because we didn’t delete it, but it could easily be delete or replaced with new values.
Not Having a PRIMARY KEY
or UNIQUE
Index Column
Another reason for REPLACE
inserting duplicate rows is that if the table doesn’t have a PRIMARY KEY
or UNIQUE
index column. In this case we can’t replace anything, because there’s no unique identifier to tell MySQL which row to replace.
Suppose we create the following table (without a PRIMARY KEY
or UNIQUE
index):
CREATE TABLE Person2
(
PersonId INT,
FirstName VARCHAR(20),
LastName VARCHAR(20)
);
INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 1, 'Homer', 'Simpson' );
INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 2, 'Lisa', 'Simpson' );
INSERT INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Bart', 'Simpson' );
SELECT * FROM Person2;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Bart | Simpson | +----------+-----------+----------+ 3 rows in set (0.00 sec)
Now let’s try to replace Bart Simpson with Sideshow Bob:
REPLACE INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person2;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Bart | Simpson | | 3 | Sideshow | Bob | +----------+-----------+----------+ 4 rows in set (0.00 sec)
This resulted in a new row being inserted, instead of the existing row being replaced. Bart wasn’t replaced at all. And the PersonId
column now has duplicate values.
That’s because the table doesn’t have a PRIMARY KEY
or UNIQUE
index to tell MySQL which row to replace. The REPLACE
statement inserts a new row in such cases.
If we run the statement again, we’ll get a truely duplicate row – including the PersonId
column:
REPLACE INTO Person2( PersonId, FirstName, LastName ) VALUES( 3, 'Sideshow', 'Bob' );
SELECT * FROM Person2;
Result:
+----------+-----------+----------+ | PersonId | FirstName | LastName | +----------+-----------+----------+ | 1 | Homer | Simpson | | 2 | Lisa | Simpson | | 3 | Bart | Simpson | | 3 | Sideshow | Bob | | 3 | Sideshow | Bob | +----------+-----------+----------+ 5 rows in set (0.00 sec)
Therefore to resolve this issue, we would need to make sure we’re dealing with a table that has a PRIMARY KEY
or UNIQUE
index.