In MySQL, the GROUPING()
function allows us to identity which rows have been generated by the WITH ROLLUP
modifier of the GROUP BY
clause.
By default, WITH ROLLUP
outputs NULL
as the value that represents all aggregate values. By this, I mean it doesn’t provide us with a nice easy to read label. It simply outputs NULL
. This makes it more difficult for us to distinguish between normal rows and super aggregate rows that were generated by WITH ROLLUP
.
This is where the GROUPING()
function can help. The GROUPING()
function allows us to identity which rows were generated by WITH ROLLUP
.
The GROUPING()
function returns a 1
or 0
, depending on whether or not the NULL
value is due to the row representing a super-aggregate value. Specifically, it returns 1
if the value in the current row is a NULL
that represents a super-aggregate value. Otherwise, it returns 0
, which indicates that the value is either a NULL
for a regular result row or is simply not NULL
.
Syntax
The syntax for the GROUPING()
function goes like this:
GROUPING(expr [, expr] ...)
So we must provide at least one expression (depicted by expr
), and we can optionally provide more.
Each expr
must be an expression that exactly matches an expression in the GROUP BY
clause.
Example
Suppose we run the following query:
SELECT
District,
Name,
Population
FROM Cities
WHERE countrycode = 'NZL';
Result:
+------------+--------------+------------+ | District | Name | Population | +------------+--------------+------------+ | Auckland | Auckland | 381800 | | Canterbury | Christchurch | 324200 | | NULL | Manukau | 281800 | | Auckland | North Shore | 187700 | | Auckland | Waitakere | 170600 | | Wellington | Wellington | 166700 | | Dunedin | Dunedin | 119600 | | Hamilton | Hamilton | 117100 | | Wellington | Lower Hutt | 98100 | +------------+--------------+------------+
This is just a list of cities and their populations, along with the district that each city belongs to. Note that this is only test data and in no way does it reflect the actual populations of these cities.
For our purposes, notice that the third row contains NULL
in the District
column.
Now, let’s modify our query a bit:
SELECT
District,
Name,
SUM(Population) AS Population
FROM Cities
WHERE CountryCode = 'NZL'
GROUP BY District, Name WITH ROLLUP;
Result:
+------------+--------------+------------+ | District | Name | Population | +------------+--------------+------------+ | NULL | Manukau | 281800 | | NULL | NULL | 281800 | | Auckland | Auckland | 381800 | | Auckland | North Shore | 187700 | | Auckland | Waitakere | 170600 | | Auckland | NULL | 740100 | | Canterbury | Christchurch | 324200 | | Canterbury | NULL | 324200 | | Dunedin | Dunedin | 119600 | | Dunedin | NULL | 119600 | | Hamilton | Hamilton | 117100 | | Hamilton | NULL | 117100 | | Wellington | Lower Hutt | 98100 | | Wellington | Wellington | 166700 | | Wellington | NULL | 264800 | | NULL | NULL | 1847600 | +------------+--------------+------------+
In this case we used the GROUP BY
clause with the WITH ROLLUP
modifier to return the same list of cities, but this time with rows that display super aggregate data for the Population
column. Specifically, our query now returns subtotals, totals, and a grand total population for each city, district, and the whole country.
Now, although the WITH ROLLUP
modifier provides us with meaningful data in the Population
column, it produces NULL
in the column for which that population represents. This may or may not be an issue, depending on how we’re using the results. However, this is where the GROUPING()
function can be beneficial.
As mentioned, the GROUPING()
function returns a 1
or 0
, depending on whether or not the NULL
value is due to the row representing a super-aggregate value. Therefore, we can use GROUPING()
to tell us which NULL
values are a result of the WITH ROLLUP
function. This allows us to distinguish between super aggregate rows and regular NULL
rows.
Here’s an example of using the GROUPING()
function in the above query:
SELECT
District,
Name,
SUM(Population) AS Population,
GROUPING(District),
GROUPING(Name)
FROM Cities
WHERE CountryCode = 'NZL'
GROUP BY District, Name WITH ROLLUP;
Result:
+------------+--------------+------------+--------------------+----------------+ | District | Name | Population | GROUPING(District) | GROUPING(Name) | +------------+--------------+------------+--------------------+----------------+ | NULL | Manukau | 281800 | 0 | 0 | | NULL | NULL | 281800 | 0 | 1 | | Auckland | Auckland | 381800 | 0 | 0 | | Auckland | North Shore | 187700 | 0 | 0 | | Auckland | Waitakere | 170600 | 0 | 0 | | Auckland | NULL | 740100 | 0 | 1 | | Canterbury | Christchurch | 324200 | 0 | 0 | | Canterbury | NULL | 324200 | 0 | 1 | | Dunedin | Dunedin | 119600 | 0 | 0 | | Dunedin | NULL | 119600 | 0 | 1 | | Hamilton | Hamilton | 117100 | 0 | 0 | | Hamilton | NULL | 117100 | 0 | 1 | | Wellington | Lower Hutt | 98100 | 0 | 0 | | Wellington | Wellington | 166700 | 0 | 0 | | Wellington | NULL | 264800 | 0 | 1 | | NULL | NULL | 1847600 | 1 | 1 | +------------+--------------+------------+--------------------+----------------+
Here, I added two columns to the result set. These columns use GROUPING()
to tell us whether the NULL
value in that column represents a super aggregate or not.
Take a look at the GROUPING(Name)
column. Some rows contain 0
and others contain 1
.
Here’s why:
- The rows that contain
0
are not super aggregate rows. - The rows that contain
1
are super aggregate rows. That is, theirNULL
value was produced by theWITH ROLLUP
modifier. We can see this by looking at the data in the other columns. We can see that thePopulation
column calculates super aggregate totals from the rows in the group. The last row calculates the total population across all cities, and so we can see a1
in both theGROUPING(Name)
column and theGROUPING(District)
column. That simply means that theNULL
values are there because its a super aggregate row.
Usage Example
We can use the information returned by GROUPING()
to provide more meaningful query results. For example, we can do the following:
SELECT
IF(GROUPING(District), 'All Districts', District) AS District,
IF(GROUPING(Name), 'All Cities', Name) As CityName,
SUM(Population) AS Population
FROM Cities
WHERE CountryCode = 'NZL'
GROUP BY District, Name WITH ROLLUP;
Result:
+---------------+--------------+------------+ | District | CityName | Population | +---------------+--------------+------------+ | NULL | Manukau | 281800 | | NULL | All Cities | 281800 | | Auckland | Auckland | 381800 | | Auckland | North Shore | 187700 | | Auckland | Waitakere | 170600 | | Auckland | All Cities | 740100 | | Canterbury | Christchurch | 324200 | | Canterbury | All Cities | 324200 | | Dunedin | Dunedin | 119600 | | Dunedin | All Cities | 119600 | | Hamilton | Hamilton | 117100 | | Hamilton | All Cities | 117100 | | Wellington | Lower Hutt | 98100 | | Wellington | Wellington | 166700 | | Wellington | All Cities | 264800 | | All Districts | All Cities | 1847600 | +---------------+--------------+------------+
Here, we used the IF()
function to check the output of GROUPING()
. If the output of GROUPING()
is 1
, then we provide some user friendly text (i.e. either All Districts
or All Cities
, depending on the column). Otherwise we don’t change anything.
In our case, we still get NULL
values in the result because the underlying data contains a NULL
value. We can modify our query to cater for these cases:
SELECT
IF(GROUPING(District), 'All Districts', IFNULL(District, 'N/A')) AS District,
IF(GROUPING(Name), 'All Cities', IFNULL(Name, 'N/A')) As CityName,
SUM(Population) AS Population
FROM Cities
WHERE CountryCode = 'NZL'
GROUP BY District, Name WITH ROLLUP;
Result:
+---------------+--------------+------------+ | District | CityName | Population | +---------------+--------------+------------+ | N/A | Manukau | 281800 | | N/A | All Cities | 281800 | | Auckland | Auckland | 381800 | | Auckland | North Shore | 187700 | | Auckland | Waitakere | 170600 | | Auckland | All Cities | 740100 | | Canterbury | Christchurch | 324200 | | Canterbury | All Cities | 324200 | | Dunedin | Dunedin | 119600 | | Dunedin | All Cities | 119600 | | Hamilton | Hamilton | 117100 | | Hamilton | All Cities | 117100 | | Wellington | Lower Hutt | 98100 | | Wellington | Wellington | 166700 | | Wellington | All Cities | 264800 | | All Districts | All Cities | 1847600 | +---------------+--------------+------------+
Our query now returns N/A
whenever the underlying data contains NULL
in either the District
or Name
columns. We achieved this outcome with the IFNULL()
function, which checks for the existence of NULL
values, and allows us to specify text to output whenever those NULL
s are encountered.
More Information
See the MySQL documentation for more information.