You can use the T-SQL EXCEPT
operator in SQL Server to return distinct rows from the left input query that aren’t output by the right input query.
Syntax
The syntax goes like this:
{ <query_specification> | ( <query_expression> ) }
{ EXCEPT }
{ <query_specification> | ( <query_expression> ) }
Actually, the Microsoft documentation includes the INTERSECT
operator in its definition, as the same syntax applies to EXCEPT
and INTERSECT
.
The Microsoft syntax goes like this:
{ <query_specification> | ( <query_expression> ) }
{ EXCEPT | INTERSECT }
{ <query_specification> | ( <query_expression> ) }
Example
Imagine you have two tables; Cats
and Dogs
.
Cats
+---------+-----------+ | CatId | CatName | |---------+-----------| | 1 | Brush | | 2 | Scarcat | | 3 | Flutter | | 4 | Flutter | +---------+-----------+
Dogs
+---------+-----------+ | DogId | DogName | |---------+-----------| | 1 | Yelp | | 2 | Woofer | | 3 | Brush | | 4 | Brush | +---------+-----------+
We can use the EXCEPT
operator to return only the distinct rows from the left input query that aren’t output by the right input query.
Here’s an example.
SELECT CatName FROM Cats
EXCEPT
SELECT DogName FROM Dogs;
Result:
+-----------+ | CatName | |-----------| | Flutter | | Scarcat | +-----------+
So we only get values that appear in the Cats
table that don’t also appear in the Dogs
table. As mentioned, it returns distinct rows, so only one row is returned for Flutter
.
We can also switch it around and put the Dogs
table on the left and Cats
on the right.
SELECT DogName FROM Dogs
EXCEPT
SELECT CatName FROM Cats;
Result:
+-----------+ | DogName | |-----------| | Woofer | | Yelp | +-----------+
The EXCEPT
operator appears as a LEFT ANTI SEMI JOIN
in the execution plan.
So our first example is similar to doing the following:
SELECT
DISTINCT CatName
FROM Cats c
WHERE NOT EXISTS (SELECT DogName FROM Dogs d
WHERE c.CatName = d.DogName);
Result:
+-----------+ | CatName | |-----------| | Flutter | | Scarcat | +-----------+
Note that when using EXCEPT
, the number and the order of the columns must be the same in all queries. Also, the data types must be compatible. They don’t actually have to be the same, but they must be comparable through implicit conversion.
Also, when comparing column values for determining DISTINCT
rows, two NULL
values are considered equal.
If you intend to use EXCEPT
in distributed queries, note that it is only executed on the local server and not pushed to the linked server, and this could therefore affect performance.