Understanding the INTERSECT Clause in MySQL

When combining multiple query blocks in MySQL, we can use the INTERSECT clause to return just those rows that are common to both query blocks.

It’s a bit like the UNION clause, except that it excludes rows that aren’t present in both queries.

Example

Suppose we have two tables:

SELECT * FROM t1;
SELECT * FROM t2;

Result:

+--------+-------+
| c1     | c2    |
+--------+-------+
| Black  | Dog   |
| White  | Dog   |
| Yellow | Cow   |
| Green  | Horse |
| Black  | Cat   |
| White  | Cat   |
| Blue   | Fish  |
+--------+-------+

+---------+-------+
| c1      | c2    |
+---------+-------+
| Black   | Dog   |
| White   | Dog   |
| Pink    | Cow   |
| Red     | Horse |
| Black   | Cat   |
| Crimson | Cat   |
| White   | Bird  |
+---------+-------+

We can see that there are some duplicate rows across both tables. And if we narrow it to just each column, there are even more duplicates.

Let’s use the INTERSECT clause against those tables:

SELECT * FROM t1
INTERSECT
SELECT * FROM t2;

Result:

+-------+------+
| c1    | c2   |
+-------+------+
| Black | Dog  |
| White | Dog  |
| Black | Cat  |
+-------+------+

This query returns all rows that are common across both tables.

Changing the Number of Columns

The result will depend on which columns are selected, because changing the columns will change the combinations that lead to identical rows.

Let’s select just a single column from each table to see how that affects the result:

SELECT c1 FROM t1
INTERSECT
SELECT c1 FROM t2;

Result:

+-------+
| c1    |
+-------+
| Black |
| White |
+-------+

This time only two rows are returned. That’s because the default behaviour for INTERSECT is to return distinct (unique) values.

The DISTINCT Keyword

We can use the DISTINCT keyword to explicitly specify that we only want distinct values returned. As mentioned, this is the default behaviour of INTERSECT, and so using this keyword is optional – using it will yield the same result as not using it.

In any case, we can achieve the same result as the above example by doing this:

SELECT c1 FROM t1
INTERSECT DISTINCT
SELECT c1 FROM t2;

Result:

+-------+
| c1    |
+-------+
| Black |
| White |
+-------+

The INTERSECT clause provides a more concise way of getting a result that we would otherwise need to include a lot more code in our query. Here’s an example of how we can get the same result without using the INTERSECT clause:

SELECT 
    DISTINCT c1
FROM t1
WHERE EXISTS (SELECT c1 FROM t2
WHERE t1.c1 = t2.c1);

Result:

+-------+
| c1    |
+-------+
| Black |
| White |
+-------+

The ALL Keyword

We can use the ALL keyword to return all rows – not just distinct rows:

SELECT c1 FROM t1
INTERSECT ALL
SELECT c1 FROM t2;

Result:

+-------+
| c1    |
+-------+
| Black |
| Black |
| White |
| White |
+-------+

So we can see that this can easily produce duplicate rows. Each of our tables has Black twice and White twice, and so using ALL resulted in two rows for each.

Using INTERSECT with the TABLE Statement

We can use the INTERSECT clause with the TABLE statement, just as we can use it with the SELECT statement:

TABLE t1 INTERSECT TABLE t2;

Result:

+-------+------+
| c1    | c2   |
+-------+------+
| Black | Dog  |
| White | Dog  |
| Black | Cat  |
+-------+------+

Using INTERSECT with the VALUES Statement

We can also use INTERSECT with the stand alone VALUES statement:

VALUES
    ROW( 'Black', 'Dog' ),
    ROW( 'White', 'Dog' ),
    ROW( 'Yellow', 'Cow' ),
    ROW( 'Green', 'Horse' ),
    ROW( 'Black', 'Cat' ),
    ROW( 'White', 'Cat' ),
    ROW( 'Blue', 'Fish' )
INTERSECT
VALUES 
    ROW( 'Black', 'Dog' ),
    ROW( 'White', 'Dog' ),
    ROW( 'Pink', 'Cow' ),
    ROW( 'Red', 'Horse' ),
    ROW( 'Black', 'Cat' ),
    ROW( 'Crimson', 'Cat' ),
    ROW( 'White', 'Bird' );

Result:

+----------+----------+
| column_0 | column_1 |
+----------+----------+
| Black    | Dog      |
| White    | Dog      |
| Black    | Cat      |
+----------+----------+