The SQL UNION
operator concatenates the results of two queries into a single result set. By default it returns distinct rows (i.e. it removes any redundant duplicate rows from the result set). But we can also use UNION ALL
to return non-distinct rows (i.e. retain duplicates).
When it comes to NULL values, it’s pretty straight forward. SQL treats two NULL values as non distinct values. In other words, they’re duplicates.
Example
Here’s a quick example to demonstrate:
SELECT NULL
UNION
SELECT NULL
UNION
SELECT NULL;
Result:
+------------+ | NULL | +------------+ | NULL | +------------+ 1 row in set (0.00 sec)
So three NULL
s were reduced to just one NULL
.
This is the same as doing the following:
SELECT NULL
UNION DISTINCT
SELECT NULL
UNION DISTINCT
SELECT NULL;
Result:
+------------+ | NULL | +------------+ | NULL | +------------+ 1 row in set (0.00 sec)
This will depend on your RDBMS though, the DISTINCT
keyword with the UNION
operator is not widely supported.
We can use the ALL
keyword to include all duplicates, and that includes duplicate NULL
s:
SELECT NULL
UNION ALL
SELECT NULL
UNION ALL
SELECT NULL;
Result:
+------------+ | NULL | +------------+ | NULL | | NULL | | NULL | +------------+ 3 rows in set (0.01 sec)
A Database Example
Here are examples that use data from a database.
Sample Data
The following examples use these tables:
Cats: +-------+-----------+-----------+------+ | CatId | FirstName | LastName | Age | +-------+-----------+-----------+------+ | 1 | Fetch | Feather | 7 | | 2 | Meow | McCormack | NULL | | 3 | Fluffy | NULL | NULL | | 4 | Scratch | Stone | NULL | | 5 | Scratch | Head | 10 | +-------+-----------+-----------+------+ 5 rows in set (0.00 sec) Dogs: +-------+-----------+----------+------+ | DogId | FirstName | LastName | Age | +-------+-----------+----------+------+ | 1 | Bark | Smith | 7 | | 2 | Wag | Conner | NULL | | 3 | Bark | NULL | NULL | | 4 | Bark | Jones | NULL | | 5 | Fetch | Fraser | 10 | | 6 | Bite | Tohill | 10 | +-------+-----------+----------+------+ 6 rows in set (0.00 sec)
Both tables have the same column names, and we can see that both tables have NULL
values in their respective Age
columns in three rows.
Using UNION
Let’s run a query that uses the UNION
operator:
SELECT Age FROM Cats
UNION
SELECT Age FROM Dogs;
Result:
+------+ | Age | +------+ | 7 | | NULL | | 10 | +------+
We selected just one column – the Age
column. We only got one instance of NULL
even though there are multiple instances of NULL
in the Age
column.
Let’s add another column to our query:
SELECT FirstName, Age FROM Cats
UNION
SELECT FirstName, Age FROM Dogs;
Result:
+-----------+------+ | FirstName | Age | +-----------+------+ | Fetch | 7 | | Meow | NULL | | Fluffy | NULL | | Scratch | NULL | | Scratch | 10 | | Bark | 7 | | Wag | NULL | | Bark | NULL | | Fetch | 10 | | Bite | 10 | +-----------+------+ 10 rows in set (0.00 sec)
This time more rows are returned because the combined columns have increased their distinctness. However, one row has been eliminated, and so only ten rows are returned instead of eleven. The Dogs
table contains two 10 year old dogs named Bark. This resulted in duplicate rows (because we were only selecting the name and age of each dog). Therefore, one of those rows was removed from the results.
Using UNION DISTINCT
As mentioned, UNION
removes redundant duplicates. In some RDBMSs we can do this explicitly by adding the DISTINCT
keyword:
SELECT Age FROM Cats
UNION DISTINCT
SELECT Age FROM Dogs;
Result:
+------+ | Age | +------+ | 7 | | NULL | | 10 | +------+
Same result.
Using UNION ALL
If we change it to UNION ALL
, duplicates are retained:
SELECT Age FROM Cats
UNION ALL
SELECT Age FROM Dogs;
Result:
+------+ | Age | +------+ | 7 | | NULL | | NULL | | NULL | | 10 | | 7 | | NULL | | NULL | | NULL | | 10 | | 10 | +------+ 11 rows in set (0.00 sec)