Most of the major RDBMSs provide us with the ability to perform set operations.
A set operation is one that combines the results of multiple query blocks into one. For example, we could combine the result of two separate SELECT
statements. The results are displayed in a single result set as though we’d run a single query.
Types of Set Operations
There are generally three different set operations that we can perform. These are usually done with the following set operators:
UNION
: Combines all results from two query blocks into a single result.INTERSECT
: Combines only those rows which the results of two query blocks have in common.EXCEPT
: Returns all rows from the first query that are not present in the second query. Some RDBMSs (such as Oracle) call thisMINUS
.
By default, set operations only return distinct rows (i.e. they remove duplicates). But we can use the ALL
keyword to include duplicates if required.
We can also combine different set operations within a query. For example, we could use UNION
and INTERSECT
in the same query.
Example of UNION
Suppose we have the following two tables:
SELECT * FROM t1;
SELECT * FROM t2;
Result:
+--------+-------+ | c1 | c2 | +--------+-------+ | Black | Dog | | White | Dog | | Yellow | Cow | | Green | Horse | | Black | Cat | | White | Cat | | Blue | Fish | +--------+-------+ 7 rows in set (0.00 sec) +---------+-------+ | c1 | c2 | +---------+-------+ | Black | Dog | | White | Dog | | Pink | Cow | | Red | Horse | | Black | Cat | | Crimson | Cat | | White | Bird | +---------+-------+ 7 rows in set (0.00 sec)
Here’s how we can use UNION
to combine those two result sets into a single result set:
SELECT * FROM t1
UNION
SELECT * FROM t2;
Result:
+---------+-------+ | c1 | c2 | +---------+-------+ | Black | Dog | | White | Dog | | Yellow | Cow | | Green | Horse | | Black | Cat | | White | Cat | | Blue | Fish | | Pink | Cow | | Red | Horse | | Crimson | Cat | | White | Bird | +---------+-------+ 11 rows in set (0.00 sec)
We can see that only 11 rows are returned, despite there being 7 rows in each table. This is because duplicate rows are removed by default. They can be included by using the ALL
keyword. More on this later.
Most RDBMSs will use the column names from the first table when performing set operations, but we can change the column names with column aliases.
Example of INTERSECT
Here’s an example of INTERSECT
:
SELECT * FROM t1
INTERSECT
SELECT * FROM t2;
Result:
+-------+------+ | c1 | c2 | +-------+------+ | White | Dog | | Black | Cat | | Black | Dog | +-------+------+ 3 rows in set (0.00 sec)
This time only 3 rows were returned. That’s because INTERSECT
only returns rows that are in both tables. If a row is in one table but not the other, then it’s not returned.
Example of EXCEPT
Here’s an example of EXCEPT
:
SELECT * FROM t1
EXCEPT
SELECT * FROM t2;
Result:
+--------+-------+ | c1 | c2 | +--------+-------+ | Yellow | Cow | | Blue | Fish | | Green | Horse | | White | Cat | +--------+-------+ 4 rows in set (0.00 sec)
This time we only get the rows that are in the first table but not the second one. That’s how EXCEPT
works, and it’s how MINUS
works in Oracle.
The ALL
Modifier
We can use the ALL
modifier to all rows (including duplicates):
SELECT * FROM t1
UNION ALL
SELECT * FROM t2;
Result:
+---------+-------+ | c1 | c2 | +---------+-------+ | Black | Dog | | White | Dog | | Yellow | Cow | | Green | Horse | | Black | Cat | | White | Cat | | Blue | Fish | | Black | Dog | | White | Dog | | Pink | Cow | | Red | Horse | | Black | Cat | | Crimson | Cat | | White | Bird | +---------+-------+ 14 rows in set (0.01 sec)
This time all 14 rows are returned, including duplicate rows.
The DISTINCT
Modifier
As mentioned, the default behaviour of SQL set operations is to return distinct rows. We can also specify this explicitly with the DISTINCT
modifier if we want:
SELECT * FROM t1
UNION DISTINCT
SELECT * FROM t2;
Result:
+---------+-------+ | c1 | c2 | +---------+-------+ | Black | Dog | | White | Dog | | Yellow | Cow | | Green | Horse | | Black | Cat | | White | Cat | | Blue | Fish | | Pink | Cow | | Red | Horse | | Crimson | Cat | | White | Bird | +---------+-------+ 11 rows in set (0.00 sec)
We got the same result that we got in the first example. That’s because DISTINCT
is the default behaviour for SQL set operations.