Understanding GROUPING SETS() in SQL Server

When using the GROUP BY clause in SQL Server, we can use the GROUPING SETS() option to combine multiple GROUP BY clauses into one GROUP BY clause. When we do this, the results are the equivalent of doing a UNION ALL of the specified groups.

Example

Suppose we have the following table:

SELECT * FROM Dogs;

Result:

DogName  Activity     Score
-------  -----------  -----
Bruno    Fetch Stick  43   
Cooper   Fetch Stick  43   
Max      Fetch Stick  91   
Bruno    Wag Tail     65   
Cooper   Wag Tail     51   
Max      Wag Tail     87   
Bruno    Keep Quiet   3    
Cooper   Keep Quiet   null 
Max      Keep Quiet   1    

This is a table of dogs and their scores at performing various activities.

Here’s an example of a query that uses GROUP BY GROUPING SETS() against that table:

SELECT 
    Activity, 
    DogName,
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY GROUPING SETS (
    ROLLUP (Activity, DogName), 
    CUBE (Activity, DogName)
    );

Result:

Activity     DogName  Score
-----------  -------  -----
Fetch Stick  Bruno    43   
Keep Quiet   Bruno    3    
Wag Tail     Bruno    65   
null         Bruno    111  
Fetch Stick  Cooper   43   
Keep Quiet   Cooper   null 
Wag Tail     Cooper   51   
null         Cooper   94   
Fetch Stick  Max      91   
Keep Quiet   Max      1    
Wag Tail     Max      87   
null         Max      179  
null         null     384  
Fetch Stick  Bruno    43   
Fetch Stick  Cooper   43   
Fetch Stick  Max      91   
Fetch Stick  null     177  
Keep Quiet   Bruno    3    
Keep Quiet   Cooper   null 
Keep Quiet   Max      1    
Keep Quiet   null     4    
Wag Tail     Bruno    65   
Wag Tail     Cooper   51   
Wag Tail     Max      87   
Wag Tail     null     203  
null         null     384  
Fetch Stick  null     177  
Keep Quiet   null     4    
Wag Tail     null     203 

In this example I used the SUM() function to calculate the total scores for each of the dogs, for each activity, and also a grand total for all activities.

I then used GROUPING SETS() to combine a GROUP BY ROLLUP and GROUP BY CUBE into one.

The GROUP BY() Option

There’s also a GROUP BY() option that provides an easy way to get a grand total.

Here’s an example that uses this option:

SELECT 
    Activity,
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY GROUPING SETS (Activity, ());

Result:

Activity     Score
-----------  -----
Fetch Stick  177  
Keep Quiet   4    
Wag Tail     203  
null         384  

In this case I simply got the total scores for each activity, as well as a grand total for all activities.

Without the GROUP BY() option, we would get the following:

SELECT 
    Activity,
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY Activity;

Result:

Activity     Score
-----------  -----
Fetch Stick  177  
Keep Quiet   4    
Wag Tail     203 

So we would get the sub-totals, but not the grand total.

Changing the NULL Values

We can see that the ROLLUP and CUBE operations result in NULL being returned in certain columns for that row.

The NULL returned by ROLLUP, CUBE, and GROUPING SETS is a special use of NULL. Microsoft’s documentation states:

The NULL returned as the result of a ROLLUP, CUBE or GROUPING SETS operation is a special use of NULL. This acts as a column placeholder in the result set and means all.

So in our case, whenever we see null in either the Activity column or the DogName column, we know that it means “all”.

That being said, it’s also entirely possible that a column could contain NULL values that aren’t related to the ROLLUP or CUBE operations. In that case, it could be easy to get confused over which one was produced by ROLLUP or CUBE and which one is simply a NULL value in the underlying data.

Fortunately, we can use the following technique to help us distinguish between these different NULLs:

SELECT 
    IIF(GROUPING(Activity) = 1, 'All Activities', Activity) AS "Activity", 
    IIF(GROUPING(DogName) = 1, 'All Dogs', DogName) AS "DogName",
    SUM(Score) AS "Score"
FROM Dogs  
GROUP BY GROUPING SETS (
    ROLLUP (Activity, DogName), 
    CUBE (Activity, DogName)
    );

Result:

Activity        DogName   Score
--------------  --------  -----
Fetch Stick     Bruno     43   
Keep Quiet      Bruno     3    
Wag Tail        Bruno     65   
All Activities  Bruno     111  
Fetch Stick     Cooper    43   
Keep Quiet      Cooper    null 
Wag Tail        Cooper    51   
All Activities  Cooper    94   
Fetch Stick     Max       91   
Keep Quiet      Max       1    
Wag Tail        Max       87   
All Activities  Max       179  
All Activities  All Dogs  384  
Fetch Stick     Bruno     43   
Fetch Stick     Cooper    43   
Fetch Stick     Max       91   
Fetch Stick     All Dogs  177  
Keep Quiet      Bruno     3    
Keep Quiet      Cooper    null 
Keep Quiet      Max       1    
Keep Quiet      All Dogs  4    
Wag Tail        Bruno     65   
Wag Tail        Cooper    51   
Wag Tail        Max       87   
Wag Tail        All Dogs  203  
All Activities  All Dogs  384  
Fetch Stick     All Dogs  177  
Keep Quiet      All Dogs  4    
Wag Tail        All Dogs  203  

Here I used the GROUPING() function in conjunction with the IIF() function to replace any aggregated NULLs with a string.

The GROUPING() function returns 1 whenever the column expression is aggregated. I therefore passed it to the IIF() function to present a different value, depending on whether it’s aggregated or not.

In this case, the Activity column returns the string All Activities if it’s aggregated, and just the column value if it’s not aggregated. The DogName column returns All Dogs if it’s aggregated, and just the column value if not.