In MySQL, the UNION
clause combines the results from multiple queries into a single result set.
Example
Suppose we have the following tables:
SELECT * FROM Teachers;
SELECT * FROM Students;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 1 | Warren | | 2 | Ben | | 3 | Cathy | | 4 | Cathy | | 5 | Bill | | 6 | Bill | +-----------+-------------+ +-----------+-------------+ | StudentId | StudentName | +-----------+-------------+ | 1 | Faye | | 2 | Jet | | 3 | Spike | | 4 | Ein | | 5 | Warren | | 6 | Bill | +-----------+-------------+
We can insert the UNION
clause between those two SELECT
statements to return all teachers and students:
SELECT TeacherName FROM Teachers
UNION
SELECT StudentName FROM Students;
Result:
+-------------+ | TeacherName | +-------------+ | Warren | | Ben | | Cathy | | Bill | | Faye | | Jet | | Spike | | Ein | +-------------+
The column names are taken from the first SELECT
statement.
By default, the UNION
clause implicitly applies a DISTINCT
operation. In other words, it returns only distinct values by default. So the above results contain just one each of Warren, Cathy and Bill. This is despite the fact that the combined tables actually contain two Warrens, two Cathys, and three Bills (there are two teachers called Cathy, a teacher and a customer called Warren, and two called Bill, as well as one student called Bill).
Here’s an example that explicitly uses the DISTINCT
clause:
SELECT TeacherName FROM Teachers
UNION DISTINCT
SELECT StudentName FROM Students;
Result:
+-------------+ | TeacherName | +-------------+ | Warren | | Ben | | Cathy | | Bill | | Faye | | Jet | | Spike | | Ein | +-------------+
So we get the same result that we got without the DISTINCT
clause.
Include Duplicates
We can use the ALL
keyword to include duplicate values in the results:
SELECT TeacherName FROM Teachers
UNION ALL
SELECT StudentName FROM Students;
Result:
+-------------+ | TeacherName | +-------------+ | Warren | | Ben | | Cathy | | Cathy | | Bill | | Bill | | Faye | | Jet | | Spike | | Ein | | Warren | | Bill | +-------------+
This time we got twelve rows instead of the eight that we got in our first example.
We can see that both Cathys were returned and all three Bills were returned.
TABLE
Statements
From MySQL 8.0.19 we can use the UNION
clause with the TABLE
statement.
Here’s an example:
TABLE Teachers
UNION
TABLE Students;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 1 | Warren | | 2 | Ben | | 3 | Cathy | | 4 | Cathy | | 5 | Bill | | 6 | Bill | | 1 | Faye | | 2 | Jet | | 3 | Spike | | 4 | Ein | | 5 | Warren | +-----------+-------------+
That’s the equivalent of the following query:
SELECT * FROM Teachers
UNION
SELECT * FROM Students;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 1 | Warren | | 2 | Ben | | 3 | Cathy | | 4 | Cathy | | 5 | Bill | | 6 | Bill | | 1 | Faye | | 2 | Jet | | 3 | Spike | | 4 | Ein | | 5 | Warren | +-----------+-------------+
You’ll notice that these statements return more rows than in our first example earlier. That’s because we’re selecting all columns in the table, which results in non-duplicates where there was previously a duplicate. For example, two teachers called Bill are returned here whereas only one was returned in the earlier example. That’s because the TeacherId
columns contain different values, hence the rows are not duplicates.
Using the ORDER BY
Clause in Union Queries
We can use the ORDER BY
clause in each SELECT
statement and/or on the combined UNION
query.
In Each SELECT
Statement
When we use the ORDER BY
clause in the individual SELECT
statements within a UNION
query, we need to enclose each SELECT
statement inside parentheses:
(SELECT * FROM Teachers ORDER BY TeacherName ASC LIMIT 2)
UNION
(SELECT * FROM Students ORDER BY StudentName ASC LIMIT 2);
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 2 | Ben | | 5 | Bill | | 6 | Bill | | 4 | Ein | +-----------+-------------+
Note that when we do this, it doesn’t actually order the results for the output. It only orders the results for the purpose of determining the subset of the selected rows to retrieve when applying the LIMIT
clause.
Therefore, using ORDER BY
without the LIMIT
clause has no effect on the output.
On the Whole UNION
Query
We can also use an ORDER BY
clause on the whole query, so that the whole output is ordered together.
In this example, we take the previous example and order the combined results:
(SELECT * FROM Teachers ORDER BY TeacherName ASC LIMIT 2)
UNION
(SELECT * FROM Students ORDER BY StudentName ASC LIMIT 2)
ORDER BY TeacherName DESC;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 4 | Ein | | 5 | Bill | | 6 | Bill | | 2 | Ben | +-----------+-------------+
Even when not using the ORDER BY
clause within each SELECT
statement, each SELECT
statement should still be in parentheses, and the ORDER BY
clause (or any LIMIT
clause) should be after the last one.
(SELECT * FROM Teachers)
UNION
(SELECT * FROM Students)
ORDER BY TeacherName ASC;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 2 | Ben | | 5 | Bill | | 6 | Bill | | 3 | Cathy | | 4 | Cathy | | 4 | Ein | | 1 | Faye | | 2 | Jet | | 3 | Spike | | 1 | Warren | | 5 | Warren | +-----------+-------------+
Mind you, omitting the parentheses produces the same result as the one with parentheses:
SELECT * FROM Teachers
UNION
SELECT * FROM Students
ORDER BY TeacherName ASC;
Result:
+-----------+-------------+ | TeacherId | TeacherName | +-----------+-------------+ | 2 | Ben | | 5 | Bill | | 6 | Bill | | 3 | Cathy | | 4 | Cathy | | 4 | Ein | | 1 | Faye | | 2 | Jet | | 3 | Spike | | 1 | Warren | | 5 | Warren | +-----------+-------------+
Note that if a column to be sorted uses an alias, then that column must be referenced by its alias (not the column name).
Example:
(SELECT TeacherName t FROM Teachers)
UNION
(SELECT StudentName FROM Students)
ORDER BY t ASC;
Result:
+--------+ | t | +--------+ | Ben | | Bill | | Cathy | | Ein | | Faye | | Jet | | Spike | | Warren | +--------+
Here’s what happens if we don’t use the alias:
(SELECT TeacherName t FROM Teachers)
UNION
(SELECT StudentName FROM Students)
ORDER BY TeacherName ASC;
Result:
ERROR 1054 (42S22): Unknown column 'TeacherName' in 'order clause'
Number of Columns
The number of columns returned by each SELECT
statement must be the same. Therefore, we can’t do the following:
SELECT TeacherName FROM Teachers
UNION
SELECT StudentId, StudentName FROM Students;
Result:
ERROR 1222 (21000): The used SELECT statements have a different number of columns
Data Types
Selected columns listed in corresponding positions of each SELECT
statement should have the same data type. However, if they don’t, the types and lengths of the columns in the UNION
result take into account the values retrieved by all the SELECT
statements.
Here’s what happens if we try to combine the TeacherName
column with the StudentId
column:
SELECT TeacherName FROM Teachers
UNION
SELECT StudentId FROM Students;
Result:
+-------------+ | TeacherName | +-------------+ | Warren | | Ben | | Cathy | | Bill | | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | +-------------+
Some other RDBMSs would produce an error in this instance, but MySQL manages to produce output without error.