SQL Set Operations Explained

Most of the major RDBMSs provide us with the ability to perform set operations.

A set operation is one that combines the results of multiple query blocks into one. For example, we could combine the result of two separate SELECT statements. The results are displayed in a single result set as though we’d run a single query.

Types of Set Operations

There are generally three different set operations that we can perform. These are usually done with the following set operators:

  • UNION: Combines all results from two query blocks into a single result.
  • INTERSECT: Combines only those rows which the results of two query blocks have in common.
  • EXCEPT: Returns all rows from the first query that are not present in the second query. Some RDBMSs (such as Oracle) call this MINUS.

By default, set operations only return distinct rows (i.e. they remove duplicates). But we can use the ALL keyword to include duplicates if required.

We can also combine different set operations within a query. For example, we could use UNION and INTERSECT in the same query.

Example of UNION

Suppose we have the following two tables:

SELECT * FROM t1;
SELECT * FROM t2;

Result:

+--------+-------+
| c1     | c2    |
+--------+-------+
| Black  | Dog   |
| White  | Dog   |
| Yellow | Cow   |
| Green  | Horse |
| Black  | Cat   |
| White  | Cat   |
| Blue   | Fish  |
+--------+-------+
7 rows in set (0.00 sec)

+---------+-------+
| c1      | c2    |
+---------+-------+
| Black   | Dog   |
| White   | Dog   |
| Pink    | Cow   |
| Red     | Horse |
| Black   | Cat   |
| Crimson | Cat   |
| White   | Bird  |
+---------+-------+
7 rows in set (0.00 sec)

Here’s how we can use UNION to combine those two result sets into a single result set:

SELECT * FROM t1
UNION
SELECT * FROM t2;

Result:

+---------+-------+
| c1      | c2    |
+---------+-------+
| Black   | Dog   |
| White   | Dog   |
| Yellow  | Cow   |
| Green   | Horse |
| Black   | Cat   |
| White   | Cat   |
| Blue    | Fish  |
| Pink    | Cow   |
| Red     | Horse |
| Crimson | Cat   |
| White   | Bird  |
+---------+-------+
11 rows in set (0.00 sec)

We can see that only 11 rows are returned, despite there being 7 rows in each table. This is because duplicate rows are removed by default. They can be included by using the ALL keyword. More on this later.

Most RDBMSs will use the column names from the first table when performing set operations, but we can change the column names with column aliases.

Example of INTERSECT

Here’s an example of INTERSECT:

SELECT * FROM t1
INTERSECT
SELECT * FROM t2;

Result:

+-------+------+
| c1    | c2   |
+-------+------+
| White | Dog  |
| Black | Cat  |
| Black | Dog  |
+-------+------+
3 rows in set (0.00 sec)

This time only 3 rows were returned. That’s because INTERSECT only returns rows that are in both tables. If a row is in one table but not the other, then it’s not returned.

Example of EXCEPT

Here’s an example of EXCEPT:

SELECT * FROM t1
EXCEPT
SELECT * FROM t2;

Result:

+--------+-------+
| c1     | c2    |
+--------+-------+
| Yellow | Cow   |
| Blue   | Fish  |
| Green  | Horse |
| White  | Cat   |
+--------+-------+
4 rows in set (0.00 sec)

This time we only get the rows that are in the first table but not the second one. That’s how EXCEPT works, and it’s how MINUS works in Oracle.

The ALL Modifier

We can use the ALL modifier to all rows (including duplicates):

SELECT * FROM t1
UNION ALL
SELECT * FROM t2;

Result:

+---------+-------+
| c1      | c2    |
+---------+-------+
| Black   | Dog   |
| White   | Dog   |
| Yellow  | Cow   |
| Green   | Horse |
| Black   | Cat   |
| White   | Cat   |
| Blue    | Fish  |
| Black   | Dog   |
| White   | Dog   |
| Pink    | Cow   |
| Red     | Horse |
| Black   | Cat   |
| Crimson | Cat   |
| White   | Bird  |
+---------+-------+
14 rows in set (0.01 sec)

This time all 14 rows are returned, including duplicate rows.

The DISTINCT Modifier

As mentioned, the default behaviour of SQL set operations is to return distinct rows. We can also specify this explicitly with the DISTINCT modifier if we want:

SELECT * FROM t1
UNION DISTINCT
SELECT * FROM t2;

Result:

+---------+-------+
| c1      | c2    |
+---------+-------+
| Black   | Dog   |
| White   | Dog   |
| Yellow  | Cow   |
| Green   | Horse |
| Black   | Cat   |
| White   | Cat   |
| Blue    | Fish  |
| Pink    | Cow   |
| Red     | Horse |
| Crimson | Cat   |
| White   | Bird  |
+---------+-------+
11 rows in set (0.00 sec)

We got the same result that we got in the first example. That’s because DISTINCT is the default behaviour for SQL set operations.