A Special Use of NULL in SQL Server

In SQL NULL is a special value, or mark, that is used to indicate the absence of any data value. And this is the case with SQL Server.

But SQL Server also has a special use of NULL in certain cases.

Two Different NULL Meanings?

When we see NULL in SQL Server (or any other RDBMS), we usually know that this means there’s no data, and that the value is therefore unknown.

But when we use the GROUP BY clause with an option such as ROLLUP, CUBE, or GROUPING SETS, we can get NULLs that have a different meaning.

The NULL returned by these options is a special use of NULL. Microsoft explains it as follows:

The NULL returned as the result of a ROLLUP, CUBE or GROUPING SETS operation is a special use of NULL. This acts as a column placeholder in the result set and means all.

So when we get NULL as a result of ROLLUP, CUBE or GROUPING SETS, the NULL means “all”. The rows produced by these options are aggregates, and the NULL is basically a placeholder for the values of all rows that make up the aggregate.

Example

Suppose we have the following table:

SELECT * FROM Dogs;

Result:

DogName  Activity     Score
-------  -----------  -----
Bruno    Fetch Stick  43   
Cooper   Fetch Stick  43   
Max      Fetch Stick  91   
Bruno    Wag Tail     65   
Cooper   Wag Tail     51   
Max      Wag Tail     87   
Bruno    Keep Quiet   3    
Cooper   Keep Quiet   null 
Max      Keep Quiet   1    
null     Catch Cat    null 

This is a table of dogs and their scores at performing various activities.

We can see right from the outset that the last row contains null in two columns. These NULL values are in the underlying data and they indicate that there’s no data value in that row for that column.

Now let’s run a query against that table that uses GROUP BY ROLLUP:

SELECT 
    Activity, 
    DogName,
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY ROLLUP (Activity, DogName);

Result:

Activity     DogName  Score
-----------  -------  -----
Catch Cat    null     null 
Catch Cat    null     null 
Fetch Stick  Bruno    43   
Fetch Stick  Cooper   43   
Fetch Stick  Max      91   
Fetch Stick  null     177  
Keep Quiet   Bruno    3    
Keep Quiet   Cooper   null 
Keep Quiet   Max      1    
Keep Quiet   null     4    
Wag Tail     Bruno    65   
Wag Tail     Cooper   51   
Wag Tail     Max      87   
Wag Tail     null     203  
null         null     384 

Here I’m using the SUM() function to calculate a sum of the scores for each activity. I’m using GROUP BY ROLLUP to provide subtotals and a grand total of all activities.

We can see that a whole lot of new NULLs were generated by our ROLLUP operation. The ROLLUP operation generated new rows (for subtotals and a grand total), and so the NULLs are there as a placeholder for all the values that correspond to the aggregate value that’s presented in the Score column.

The NULLs generated by our ROLLUP operation are in the first two columns. That’s because we’re applying our GROUP BY ROLLUP() to those columns.

So we effectively have two different sets of NULL in our results; some of our NULLs indicate that there’s no data, but other NULLs are there as a placeholder for all values that correspond to the aggregate value.

But how do we distinguish between the two?

We can use a technique like the following to distinguish between these different cases of NULLs.

Identifying the Aggregated NULL Values

We can use the GROUPING() function to distinguish the NULL values that are returned by ROLLUP, CUBE or GROUPING SETS from standard NULL values.

The GROUPING() function returns 1 if the column expression is aggregated and 0 if it’s not.

In the following example, I add columns that use GROUPING() to return this value:

SELECT 
    Activity, 
    DogName,
    SUM(Score) AS "Score",
    GROUPING(Activity) AS "Activity Aggregated",
    GROUPING(DogName) AS "DogName Aggregated"
FROM Dogs  
GROUP BY ROLLUP (Activity, DogName);

Result:

Activity     DogName  Score  Activity Aggregated  DogName Aggregated
-----------  -------  -----  -------------------  ------------------
Catch Cat    null     null   0                    0                 
Catch Cat    null     null   0                    1                 
Fetch Stick  Bruno    43     0                    0                 
Fetch Stick  Cooper   43     0                    0                 
Fetch Stick  Max      91     0                    0                 
Fetch Stick  null     177    0                    1                 
Keep Quiet   Bruno    3      0                    0                 
Keep Quiet   Cooper   null   0                    0                 
Keep Quiet   Max      1      0                    0                 
Keep Quiet   null     4      0                    1                 
Wag Tail     Bruno    65     0                    0                 
Wag Tail     Cooper   51     0                    0                 
Wag Tail     Max      87     0                    0                 
Wag Tail     null     203    0                    1                 
null         null     384    1                    1                 

We can see whether a NULL value is aggregated or not by looking at our new columns. A value of 1 means that it is aggregated and 0 means that it’s not.

We could take it a step further and use the IIF() function to replace any aggregated NULLs with a string:

SELECT 
    IIF(GROUPING(Activity) = 1, 'All Activities', Activity) AS "Activity", 
    IIF(GROUPING(DogName) = 1, 'All Dogs', DogName) AS "DogName",
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY ROLLUP (Activity, DogName);

Result:

Activity        DogName   Score
--------------  --------  -----
Catch Cat       null      null 
Catch Cat       All Dogs  null 
Fetch Stick     Bruno     43   
Fetch Stick     Cooper    43   
Fetch Stick     Max       91   
Fetch Stick     All Dogs  177  
Keep Quiet      Bruno     3    
Keep Quiet      Cooper    null 
Keep Quiet      Max       1    
Keep Quiet      All Dogs  4    
Wag Tail        Bruno     65   
Wag Tail        Cooper    51   
Wag Tail        Max       87   
Wag Tail        All Dogs  203  
All Activities  All Dogs  384  

Now, any remaining NULLs are those that are in the underlying data, or those that are derived from NULLs in the underlying data.