In SQL Server, you can use the T-SQL INTERSECT
operator to return distinct rows that are output by both the left and right input queries.
Syntax
The syntax goes like this:
{ <query_specification> | ( <query_expression> ) }
{ INTERSECT }
{ <query_specification> | ( <query_expression> ) }
Actually, the Microsoft documentation includes the EXCEPT
operator in its definition, as the same syntax applies to INTERSECT
and EXCEPT
.
The Microsoft syntax goes like this:
{ <query_specification> | ( <query_expression> ) }
{ EXCEPT | INTERSECT }
{ <query_specification> | ( <query_expression> ) }
Example
Imagine you have two tables; Cats
and Dogs
.
Cats
+---------+-----------+ | CatId | CatName | |---------+-----------| | 1 | Brush | | 2 | Scarcat | | 3 | Flutter | | 4 | Flutter | +---------+-----------+
Dogs
+---------+-----------+ | DogId | DogName | |---------+-----------| | 1 | Yelp | | 2 | Woofer | | 3 | Brush | | 4 | Brush | +---------+-----------+
Notice that there is a cat called Brush and there are also two dogs called Brush.
We can use the INTERSECT
operator to return only the distinct rows that are output by both the left and right input queries.
In other words, it will return any values that appear in both tables, but it will only return one row (even if there are multiple rows). Therefore, in our case it will return Brush
.
Let’s do it.
SELECT
CatName AS PetName
FROM Cats
INTERSECT
SELECT
DogName AS PetName
FROM Dogs;
Result:
Brush
As mentioned, it returns distinct rows, so only one row is returned in this case.
This appears as a LEFT SEMI JOIN
in the execution plan.
It’s similar to doing the following:
SELECT
DISTINCT CatName
FROM Cats c
WHERE EXISTS (SELECT DogName FROM Dogs d
WHERE c.CatName = d.DogName);
Result:
Brush
Note that when using INTERSECT
, the number and the order of the columns must be the same in all queries. Also, the data types must be compatible. They don’t actually have to be the same, but they must be comparable through implicit conversion.
Also, when comparing column values for determining DISTINCT
rows, two NULL
values are considered equal.
If you intend to use INTERSECT
in distributed queries, note that it is only executed on the local server and not pushed to the linked server, and this could therefore affect performance.