A LEFT SEMI JOIN
is kind of a half-join. It returns any distinct values that are returned by both the query on the left and right sides of the query.
However, when using T-SQL in SQL Server, if you try to explicitly use LEFT SEMI JOIN
in your query, you’ll probably get the following error:
Msg 155, Level 15, State 1, Line 4
'SEMI' is not a recognized join option.
Fortunately, T-SQL includes the INTERSECT
operator, which allows us to perform a LEFT SEMI JOIN
.
When you use the INTERSECT
operator, it appears in the query execution plan as a LEFT SEMI JOIN
.
You can also construct a subquery that does the same thing.
Example
Imagine we have two tables; Cats
and Dogs
, and we then run the following query:
SELECT
CatName AS PetName
FROM Cats
INTERSECT
SELECT
DogName AS PetName
FROM Dogs;
This query uses the INTERSECT
operator, and it will therefore appear as a LEFT SEMI JOIN
in the execution plan.
It’s similar to doing the following:
SELECT
DISTINCT CatName
FROM Cats c
WHERE EXISTS (SELECT DogName FROM Dogs d
WHERE c.CatName = d.DogName);
Here’s the execution plan for that query:
If you use U-SQL with Azure Data Lake Analytics, you can use the SEMIJOIN
clause to do right and left semi joins. That is, you can use LEFT SEMIJOIN
or RIGHT SEMIJOIN
.