SQL UNION Clause for Beginners

In SQL, the UNION clause concatenates the results of two queries into a single result set.

You can use the UNION clause with or without the ALL argument:

  • UNION ALL – Includes duplicates.
  • UNION – Excludes duplicates.

Some RDBMSs also accept UNION DISTINCT, which is the equivalent to UNION. That is, it excludes duplicates.

Below are some basic examples to demonstrate how it works.

Sample Tables

Suppose we have the following two tables:

Cats:
+---------+-----------+
| CatId   | CatName   |
|---------+-----------|
| 1       | Meow      |
| 2       | Fluffy    |
| 3       | Scratch   |
+---------+-----------+

Dogs:
+---------+-----------+
| DogId   | DogName   |
|---------+-----------|
| 1       | Fetch     |
| 2       | Fluffy    |
| 3       | Wag       |
| 1002    | Fetch     |
+---------+-----------+

We can use a SELECT statement with a UNION clause to combine the results from both tables into one result set.

Example using UNION ALL

First, let’s use UNION ALL so that it includes duplicates.

SELECT DogName AS PetName
FROM Dogs
UNION ALL
SELECT CatName
FROM Cats;

Result:

+-----------+
| PetName   |
|-----------|
| Fetch     |
| Fluffy    |
| Wag       |
| Fetch     |
| Meow      |
| Fluffy    |
| Scratch   |
+-----------+
(7 rows affected)

In this case, seven rows are returned. We can see that “Fetch” is returned twice. This is because there are two dogs named Fetch.

There’s also a cat and a dog with the same name: Fluffy.

Note that I used a column alias to name the field returned by the operation. If I hadn’t done that, the result would have used the column names from the first query. In that case, the column header would have been called DogName instead of PetName.

SELECT DogName
FROM Dogs
UNION ALL
SELECT CatName
FROM Cats;

Result:

+-----------+
| DogName   |
|-----------|
| Fetch     |
| Fluffy    |
| Wag       |
| Fetch     |
| Meow      |
| Fluffy    |
| Scratch   |
+-----------+
(7 rows affected)

This may or may not be acceptable, depending on the data that you’re returning in your query. In our case, it’s not appropriate, because not all results are dogs.

Example using UNION

Let’s see what happens when we remove the ALL argument.

SELECT DogName AS PetName
FROM Dogs
UNION
SELECT CatName
FROM Cats;

Result:

+-----------+
| PetName   |
|-----------|
| Fetch     |
| Fluffy    |
| Meow      |
| Scratch   |
| Wag       |
+-----------+
(5 rows affected)

This time only five rows are returned. Both duplicates are removed.

As mentioned, some RDBMSs accept UNION DISTINCT, which allows us to explicitly remove duplicates (even though they’re implicitly removed with UNION anyway).

UNION vs DISTINCT

Note that this is different to applying DISTINCT to each individual SELECT statement. If we’d done that, Fluffy would have been returned twice, because the ALL would only apply for the SELECT statement that it’s being applied against (not to the concatenated results).

Here’s an example to illustrate what I mean.

SELECT DISTINCT DogName AS PetName
FROM Dogs
UNION ALL
SELECT DISTINCT CatName
FROM Cats;

Result:

+-----------+
| PetName   |
|-----------|
| Fetch     |
| Fluffy    |
| Wag       |
| Fluffy    |
| Meow      |
| Scratch   |
+-----------+
(6 rows affected)

All Queries must Return the Same Number of Columns

When you use the UNION clause, each query must have the same number of columns, and they must be in the same order.

If not, you’ll get an error.

SELECT CatName FROM Cats
UNION ALL
SELECT DogId, DogName FROM Dogs;

Result:

Msg 205, Level 16, State 1, Line 1
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.

That’s the error that SQL Server returns when using an unequal number of columns. This particular error indicates that the same restriction also applies to the INTERSECT and EXCEPT operators. The error message you receive may be different, depending on your DBMS.

Data Types must be Compatible

In addition to requiring the same number of columns, those columns must have a compatible data type.

They don’t necessarily need to be the same data type, but they will need to be compatible. That is, they must be compatible through implicit conversion. If the data types don’t match, the DBMS must be able to do an implicit conversion so that they do match.

If not, you’ll get an error.

SELECT CatName FROM Cats
UNION ALL
SELECT DogId FROM Dogs;

Result:

Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'Meow' to data type int.

Ordering the Results

If you want to sort the results with the ORDER BY clause, you’ll need to put it on the last query. You can’t put a separate ORDER BY clause on each query, or for that matter, any query that isn’t the last one.

Here’s the error I get when trying to do that in SQL Server:

SELECT DogName AS PetName
FROM Dogs
ORDER BY DogName
UNION ALL
SELECT CatName
FROM Cats;

Result:

Msg 156, Level 15, State 1, Line 4
Incorrect syntax near the keyword 'UNION'.

Therefore, if we want to order the results, we’ll need to do something like this:

SELECT DogName AS PetName
FROM Dogs
UNION ALL
SELECT CatName
FROM Cats
ORDER BY PetName;

Applying UNION to the More than two Queries

The previous examples combined results from two different queries, but there’s nothing to stop you adding more. You can use it to combine the results of many queries if required.

For example, if we also had a Birds table, we could do this:

SELECT DogName AS PetName
FROM Dogs
UNION ALL
SELECT CatName
FROM Cats
UNION ALL
SELECT BirdName
FROM Birds;

Normalization

The examples on this page put cats and dogs into two separate tables. The reason i did this is because it’s a clear and concise way of illustrating how UNION works.

In practice, you might have these in the same table called, say Pets, then have a separate PetTypes table (or similar). This is known as normalization, and is the way relational databases are usually designed.

You could then run a join on these tables to return data as required.