In SQL NULL is a special value, or mark, that is used to indicate the absence of any data value. And this is the case with SQL Server.
But SQL Server also has a special use of NULL in certain cases.
Two Different NULL Meanings?
When we see NULL in SQL Server (or any other RDBMS), we usually know that this means there’s no data, and that the value is therefore unknown.
But when we use the GROUP BY clause with an option such as ROLLUP, CUBE, or GROUPING SETS, we can get NULLs that have a different meaning.
The NULL returned by these options is a special use of NULL. Microsoft explains it as follows:
The
NULLreturned as the result of aROLLUP,CUBEorGROUPING SETSoperation is a special use ofNULL. This acts as a column placeholder in the result set and means all.
So when we get NULL as a result of ROLLUP, CUBE or GROUPING SETS, the NULL means “all”. The rows produced by these options are aggregates, and the NULL is basically a placeholder for the values of all rows that make up the aggregate.
Example
Suppose we have the following table:
SELECT * FROM Dogs;
Result:
DogName Activity Score ------- ----------- ----- Bruno Fetch Stick 43 Cooper Fetch Stick 43 Max Fetch Stick 91 Bruno Wag Tail 65 Cooper Wag Tail 51 Max Wag Tail 87 Bruno Keep Quiet 3 Cooper Keep Quiet null Max Keep Quiet 1 null Catch Cat null
This is a table of dogs and their scores at performing various activities.
We can see right from the outset that the last row contains null in two columns. These NULL values are in the underlying data and they indicate that there’s no data value in that row for that column.
Now let’s run a query against that table that uses GROUP BY ROLLUP:
SELECT
Activity,
DogName,
SUM(Score) AS "Score"
FROM Dogs
GROUP BY ROLLUP (Activity, DogName);
Result:
Activity DogName Score ----------- ------- ----- Catch Cat null null Catch Cat null null Fetch Stick Bruno 43 Fetch Stick Cooper 43 Fetch Stick Max 91 Fetch Stick null 177 Keep Quiet Bruno 3 Keep Quiet Cooper null Keep Quiet Max 1 Keep Quiet null 4 Wag Tail Bruno 65 Wag Tail Cooper 51 Wag Tail Max 87 Wag Tail null 203 null null 384
Here I’m using the SUM() function to calculate a sum of the scores for each activity. I’m using GROUP BY ROLLUP to provide subtotals and a grand total of all activities.
We can see that a whole lot of new NULLs were generated by our ROLLUP operation. The ROLLUP operation generated new rows (for subtotals and a grand total), and so the NULLs are there as a placeholder for all the values that correspond to the aggregate value that’s presented in the Score column.
The NULLs generated by our ROLLUP operation are in the first two columns. That’s because we’re applying our GROUP BY ROLLUP() to those columns.
So we effectively have two different sets of NULL in our results; some of our NULLs indicate that there’s no data, but other NULLs are there as a placeholder for all values that correspond to the aggregate value.
But how do we distinguish between the two?
We can use a technique like the following to distinguish between these different cases of NULLs.
Identifying the Aggregated NULL Values
We can use the GROUPING() function to distinguish the NULL values that are returned by ROLLUP, CUBE or GROUPING SETS from standard NULL values.
The GROUPING() function returns 1 if the column expression is aggregated and 0 if it’s not.
In the following example, I add columns that use GROUPING() to return this value:
SELECT
Activity,
DogName,
SUM(Score) AS "Score",
GROUPING(Activity) AS "Activity Aggregated",
GROUPING(DogName) AS "DogName Aggregated"
FROM Dogs
GROUP BY ROLLUP (Activity, DogName);
Result:
Activity DogName Score Activity Aggregated DogName Aggregated ----------- ------- ----- ------------------- ------------------ Catch Cat null null 0 0 Catch Cat null null 0 1 Fetch Stick Bruno 43 0 0 Fetch Stick Cooper 43 0 0 Fetch Stick Max 91 0 0 Fetch Stick null 177 0 1 Keep Quiet Bruno 3 0 0 Keep Quiet Cooper null 0 0 Keep Quiet Max 1 0 0 Keep Quiet null 4 0 1 Wag Tail Bruno 65 0 0 Wag Tail Cooper 51 0 0 Wag Tail Max 87 0 0 Wag Tail null 203 0 1 null null 384 1 1
We can see whether a NULL value is aggregated or not by looking at our new columns. A value of 1 means that it is aggregated and 0 means that it’s not.
We could take it a step further and use the IIF() function to replace any aggregated NULLs with a string:
SELECT
IIF(GROUPING(Activity) = 1, 'All Activities', Activity) AS "Activity",
IIF(GROUPING(DogName) = 1, 'All Dogs', DogName) AS "DogName",
SUM(Score) AS "Score"
FROM Dogs
GROUP BY ROLLUP (Activity, DogName);
Result:
Activity DogName Score -------------- -------- ----- Catch Cat null null Catch Cat All Dogs null Fetch Stick Bruno 43 Fetch Stick Cooper 43 Fetch Stick Max 91 Fetch Stick All Dogs 177 Keep Quiet Bruno 3 Keep Quiet Cooper null Keep Quiet Max 1 Keep Quiet All Dogs 4 Wag Tail Bruno 65 Wag Tail Cooper 51 Wag Tail Max 87 Wag Tail All Dogs 203 All Activities All Dogs 384
Now, any remaining NULLs are those that are in the underlying data, or those that are derived from NULLs in the underlying data.