The MERGE
statement in SQL Server allows us to perform INSERT
, UPDATE
, and DELETE
operations in a single query. This makes it an efficient way to synchronize two tables, typically between a source and a target, based on a defined condition. Rather than running separate queries to handle insertions, updates, and deletions, we can combine all of these operations into one statement; the MERGE
statement.
Structure of the MERGE
Statement
The MERGE
statement follows a specific structure, which includes defining the target and source tables, specifying a matching condition, and then defining actions (INSERT
, UPDATE
, DELETE
) based on that condition.
Here’s the basic syntax of the MERGE
statement:
MERGE INTO target_table AS target
USING source_table AS source
ON target.key_column = source.key_column
WHEN MATCHED THEN
UPDATE SET target.column1 = source.column1, target.column2 = source.column2
WHEN NOT MATCHED BY TARGET THEN
INSERT (column1, column2) VALUES (source.column1, source.column2)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Explanation
MERGE INTO target_table
: Specifies the target table you want to synchronize with the source.USING source_table
: Specifies the source table that provides data to compare with the target table.ON
: Specifies the matching condition between the target and source tables (usually based on a key column).WHEN MATCHED
: Executes an action when the condition evaluates to true (i.e., when a row exists in both the source and target).WHEN NOT MATCHED BY TARGET
: Executes an action when a row exists in the source but not in the target (i.e., insert a new row).WHEN NOT MATCHED BY SOURCE
: Executes an action when a row exists in the target but not in the source (i.e., delete a row).
We also have the option of appending the OUTPUT
clause in order to return the rows affected by the MERGE
statement. There are examples of this later in the article.
Example 1: Basic MERGE
to Synchronize Data
Here’s our first example to demonstrate the MERGE
statement. This example demonstrates the basic clauses outlined in the syntax outlined above.
Scenario
We have two tables: Customers
(target) and NewCustomers
(source). The Customers
table holds existing customer data, while the NewCustomers
table contains updated customer data. We need to update existing records, insert new customers, and delete customers who no longer exist in the NewCustomers
table.
Step 1: Create the Tables
-- Create the Customers table (Target)
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Email NVARCHAR(100)
);
-- Create the NewCustomers table (Source)
CREATE TABLE NewCustomers (
CustomerID INT PRIMARY KEY,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Email NVARCHAR(100)
);
Step 2: Populate the Tables with Data
-- Insert initial data into Customers
INSERT INTO Customers (CustomerID, FirstName, LastName, Email)
VALUES
(1, 'Rohit', 'Rooney', '[email protected]'),
(2, 'Selina', 'Abel', '[email protected]'),
(3, 'Nelly', 'Nguyen', '[email protected]');
-- Insert data into NewCustomers
INSERT INTO NewCustomers (CustomerID, FirstName, LastName, Email)
VALUES
(1, 'Rohit', 'Rooney', '[email protected]'), -- Existing customer, no change
(2, 'Selina', 'Bradey', '[email protected]'), -- Existing customer, email change
(4, 'Fitch', 'Foley', '[email protected]'); -- New customer
Step 3: Execute the MERGE
Statement
This is the part we’ve been waiting for…
MERGE INTO Customers AS target
USING NewCustomers AS source
ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN
UPDATE SET
target.FirstName = source.FirstName,
target.LastName = source.LastName,
target.Email = source.Email
WHEN NOT MATCHED BY TARGET THEN
INSERT (CustomerID, FirstName, LastName, Email)
VALUES (source.CustomerID, source.FirstName, source.LastName, source.Email)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Notice the three distinct sections: WHEN MATCHED
, WHEN NOT MATCHED BY TARGET
, and WHEN NOT MATCHED BY SOURCE
.
These sections determine what is to happen in the different scenarios. We don’t have to provide all of these if we don’t want. For example, we could omit the last one if we don’t want to delete any rows.
Step 4: Verify the Results
Now, let’s select everything from the Customers
table to see the result of the merge:
SELECT * FROM Customers;
Result:
CustomerID FirstName LastName Email
---------- --------- -------- --------------------------
1 Rohit Rooney [email protected]
2 Selina Bradey [email protected]
4 Fitch Foley [email protected]
Explanation
- Customer 1 has no change, because the data is identical.
- Customer 2 is updated with new last name and email.
- Customer 3 is deleted because this customer is not in the
NewCustomers
table. - Customer 4 is inserted as a new customer.
Example 2: MERGE
with OUTPUT
Clause
The OUTPUT
clause can be used with MERGE
to return information about what was inserted, updated, or deleted. This can be useful for logging or auditing purposes.
When we use the OUTPUT
clause, we have the option of returning the results to the client when we run the MERGE
statement, or saving the results to a table or table variable.
Let’s save them to a table.
Step 1: Create the MergeLog
Table
We want to track the changes made by the MERGE
operation and log them into a separate MergeLog
table. So the following table is where the result of the OUTPUT
clause will be inserted and stored:
CREATE TABLE MergeLog (
Operation NVARCHAR(10),
CustomerID INT,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
Email NVARCHAR(100)
);
Step 2: Modify the MERGE
Statement to Use OUTPUT
All we need to do is append the OUTPUT
clause to our previous code example (the OUTPUT
clause goes at the end of the MERGE
statement):
MERGE INTO Customers AS target
USING NewCustomers AS source
ON target.CustomerID = source.CustomerID
WHEN MATCHED THEN
UPDATE SET
target.FirstName = source.FirstName,
target.LastName = source.LastName,
target.Email = source.Email
WHEN NOT MATCHED BY TARGET THEN
INSERT (CustomerID, FirstName, LastName, Email)
VALUES (source.CustomerID, source.FirstName, source.LastName, source.Email)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT
$action AS Operation,
inserted.CustomerID,
inserted.FirstName,
inserted.LastName,
inserted.Email
INTO MergeLog (Operation, CustomerID, FirstName, LastName, Email);
The $action
part is a special keyword that can only be used with MERGE
statements. This is used to output the operation type that occurred for each row (INSERT
, UPDATE
, or DELETE
).
The other part of the clause specifies which columns we want to return (or be stored in the table). We prefix these with either inserted
or deleted
, depending on which one we want. The inserted
prefix returns the new value and deleted
returns the old value (that was deleted, as part of either a DELETE
or an UPDATE
operation).
Step 3: Review the MergeLog
Table
The good part about saving the results to a table is that the data persists in the database. So we can run a SELECT
statement later if required.
Let’s check the log table now:
SELECT * FROM MergeLog;
Result:
Operation CustomerID FirstName LastName Email
--------- ---------- --------- -------- --------------------------
UPDATE 1 Rohit Rooney [email protected]
UPDATE 2 Selina Bradey [email protected]
DELETE null null null null
INSERT 4 Fitch Foley [email protected]
Example 3: A Product Catalog
Let’s take a scenario where a product catalog needs to be updated with prices from a supplier’s new price list. The source contains the updated product prices and categories, while the target (existing catalog) needs to reflect these updates.
Step 1: Create Product Tables
-- Create the ProductCatalog table (Target)
CREATE TABLE ProductCatalog (
ProductID INT PRIMARY KEY,
ProductName NVARCHAR(100),
Category NVARCHAR(50),
Price DECIMAL(10, 2)
);
-- Create the SupplierPriceList table (Source)
CREATE TABLE SupplierPriceList (
ProductID INT PRIMARY KEY,
ProductName NVARCHAR(100),
Category NVARCHAR(50),
Price DECIMAL(10, 2)
);
Step 2: Populate Tables with Data
-- Insert initial data into ProductCatalog
INSERT INTO ProductCatalog (ProductID, ProductName, Category, Price)
VALUES
(1, 'Laptop', 'Electronics', 999.99),
(2, 'Smartphone', 'Electronics', 499.99),
(3, 'Desk Chair', 'Furniture', 129.99),
(4, 'Washing Machine', 'Appliances', 799.99);
-- Insert updated data into SupplierPriceList
INSERT INTO SupplierPriceList (ProductID, ProductName, Category, Price)
VALUES
(1, 'Laptop', 'Electronics', 949.99), -- Price reduction
(2, 'Smartphone', 'Electronics', 479.99), -- Price reduction
(3, 'Desk Chair', 'Furniture', 139.99), -- Price increase
(5, 'Microwave Oven', 'Appliances', 199.99); -- New product
Step 3: Use MERGE
to Update Prices
MERGE INTO ProductCatalog AS target
USING SupplierPriceList AS source
ON target.ProductID = source.ProductID
WHEN MATCHED THEN
UPDATE SET
target.Price = source.Price,
target.Category = source.Category
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, ProductName, Category, Price)
VALUES (source.ProductID, source.ProductName, source.Category, source.Price)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT
$action AS Operation,
inserted.*,
deleted.*;
Output:
Operation ProductID ProductName Category Price
--------- --------- --------------------- ----------------------- -------------
UPDATE 1,1 Laptop,Laptop Electronics,Electronics 949.99,999.99
UPDATE 2,2 Smartphone,Smartphone Electronics,Electronics 479.99,499.99
UPDATE 3,3 Desk Chair,Desk Chair Furniture,Furniture 139.99,129.99
DELETE 4 Washing Machine Appliances 799.99
INSERT 5, Microwave Oven, Appliances, 199.99,
Explanation
- Products with IDs 1, 2, and 3 are updated because they exist in both tables but with different prices.
- Product with ID 4 (Washing Machine) is deleted from the
ProductCatalog
because it no longer exists in the supplier’s price list. - Product with ID 5 is inserted because it exists in the
SupplierPriceList
but not in theProductCatalog
. - The
OUTPUT
clause resulted in the old and new values sharing a column. That’s because I didn’t use an alias to distinguish between the deleted/inserted columns (in fact, I simply used the asterisk wildcard (*
) to select all columns). So the result is that the inserted value comes first, then the deleted value, separated by a comma.
Verify the Results
SELECT * FROM ProductCatalog;
Result:
ProductID ProductName Category Price
--------- -------------- ----------- ------
1 Laptop Electronics 949.99
2 Smartphone Electronics 479.99
3 Desk Chair Furniture 139.99
5 Microwave Oven Appliances 199.99
Example 4: Handling Soft Deletes
In some applications, deleting records may not be desirable. Instead, you might want to “soft delete” records by marking them as inactive. In this scenario, we’ll modify the MERGE
statement to mark records as inactive in the target table when they don’t exist in the source.
Step 1: Modify ProductCatalog Table
We’ll add a Status
column to the ProductCatalog
table to track whether a product is active or inactive.
-- Add Status column to ProductCatalog
ALTER TABLE ProductCatalog ADD Status NVARCHAR(10) DEFAULT 'Active';
Step 2: Re-Insert Original Data
Let’s remove the existing data and re-insert the original data:
-- Remove all data from both tables
TRUNCATE TABLE SupplierPriceList;
TRUNCATE TABLE ProductCatalog;
-- Insert initial data into ProductCatalog
INSERT INTO ProductCatalog (ProductID, ProductName, Category, Price)
VALUES
(1, 'Laptop', 'Electronics', 999.99),
(2, 'Smartphone', 'Electronics', 499.99),
(3, 'Desk Chair', 'Furniture', 129.99),
(4, 'Washing Machine', 'Appliances', 799.99);
-- Insert updated data into SupplierPriceList
INSERT INTO SupplierPriceList (ProductID, ProductName, Category, Price)
VALUES
(1, 'Laptop', 'Electronics', 949.99), -- Price reduction
(2, 'Smartphone', 'Electronics', 479.99), -- Price reduction
(3, 'Desk Chair', 'Furniture', 139.99), -- Price increase
(5, 'Microwave Oven', 'Appliances', 199.99); -- New product
Step 3: Modify MERGE
to Handle Soft Deletes
Now we’ll run the same MERGE
statement, but this time, for the row that needs to be deleted, it will do a soft delete instead of an actual delete:
MERGE INTO ProductCatalog AS target
USING SupplierPriceList AS source
ON target.ProductID = source.ProductID
WHEN MATCHED THEN
UPDATE SET
target.Price = source.Price,
target.Category = source.Category,
target.Status = 'Active'
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, ProductName, Category, Price, Status)
VALUES (source.ProductID, source.ProductName, source.Category, source.Price, 'Active')
WHEN NOT MATCHED BY SOURCE THEN
UPDATE SET target.Status = 'Inactive'
OUTPUT
$action AS Operation,
inserted.*,
deleted.*;
Output:
Operation ProductID ProductName Category Price Status
--------- --------- ------------------------------- ----------------------- ------------- ---------------
UPDATE 1,1 Laptop,Laptop Electronics,Electronics 949.99,999.99 Active,Active
UPDATE 2,2 Smartphone,Smartphone Electronics,Electronics 479.99,499.99 Active,Active
UPDATE 3,3 Desk Chair,Desk Chair Furniture,Furniture 139.99,129.99 Active,Active
UPDATE 4,4 Washing Machine,Washing Machine Appliances,Appliances 799.99,799.99 Inactive,Active
INSERT 5, Microwave Oven, Appliances, 199.99, Active,
Explanation
- Products that exist in both the
SupplierPriceList
and theProductCatalog
are updated and marked asActive
. - New products are inserted and marked as
Active
. - Products that don’t exist in the
SupplierPriceList
but are in theProductCatalog
are soft deleted by marking theirStatus
asInactive
.
Verify the Results
SELECT * FROM ProductCatalog;
The ProductCatalog
will now include a Status
column indicating which products are active or inactive:
ProductID ProductName Category Price Status
--------- --------------- ----------- ------ --------
1 Laptop Electronics 949.99 Active
2 Smartphone Electronics 479.99 Active
3 Desk Chair Furniture 139.99 Active
4 Washing Machine Appliances 799.99 Inactive
5 Microwave Oven Appliances 199.99 Active
Use Cases for MERGE
Here are some of the typical situations where MERGE
can be useful:
- Data synchronization: Merge data from one table to another by comparing existing records and updating, inserting, or deleting records accordingly.
- ETL processes: Merge is particularly useful in Extract, Transform, Load (ETL) operations where you need to keep tables in sync.
- Change data capture: When you need to track changes between two datasets (such as master data and transactional data).
Performance Considerations
The MERGE
statement can simplify your SQL queries, but there are some performance considerations:
- Indexing: Ensure that the columns used in the
ON
clause are indexed. This will improve the performance of theMERGE
operation, particularly when comparing large datasets. Indexes help speed up the search for matching rows between the source and target tables. - Locks:
MERGE
may hold locks longer than separateINSERT
,UPDATE
, andDELETE
statements. This is because the query is performing multiple operations in one execution. Be aware of potential lock contention, especially in highly concurrent environments. Consider adding theHOLDLOCK
hint. Microsoft states “In some scenarios where unique keys are expected to be both inserted and updated by theMERGE
, specifying theHOLDLOCK
will prevent against unique key violations.“. - Data Volume: For very large datasets,
MERGE
can sometimes be slower than handlingINSERT
,UPDATE
, andDELETE
operations individually. This is due to the complexity of combining these operations into one. For such cases, it might be better to analyze and break down the process. - Triggers: If your target table has triggers (e.g.,
AFTER INSERT
,AFTER UPDATE
,AFTER DELETE
), those triggers will be fired appropriately for each type of operation (although there’s no guarantee on which action to fire triggers first or last). This can be an advantage but might introduce performance penalties if triggers are complex or perform additional data processing. More about merging with triggers on the Microsoft website.
Considerations and Best Practices
Here are a few things to consider when thinking about using the MERGE
statement:
- Avoid Complex Logic: Keep the logic inside the
MERGE
simple. If theWHEN
clauses become too complicated, consider breaking the logic into separate statements. - Test with Small Data: Test the
MERGE
statement on a small dataset first before running it on large tables. This helps ensure the logic behaves as expected. - Performance Monitoring: As mentioned earlier, ensure that you monitor the performance of the
MERGE
statement, especially in production environments where large datasets and concurrent transactions are involved. - Limitations & Issues with
MERGE
: See Aaron Bertrand’s article on mssqltips.com about various bugs and issues with theMERGE
statement, concurrency issues, things to be aware of with triggers, and more. It’s an old article mind you, so some of the issues may have since been fixed. But it pays to check whether yourMERGE
statements and triggers are operating correctly.