We can use the OVER()
clause to create window functions in our SQL queries. A window function can be a useful tool that allows us to do things like compute moving averages, rank items, calculate cumulative sums, and much more.
Another common task for a window function might be to compute subtotals.
Example
Here’s an example of using a window function to calculate subtotals for a given category.
In this example, we’ll use the world
sample database, which contains data such as countries, districts, cities, and city population. We’ll use a window function to add up the population of all cities within their respective districts, while presenting the population of each city and district in the same result set.
This example was done in MySQL, but it should work in most major RDBMSs. That said, you should check your RDBMS’s documentation for window functions in case it uses a different syntax.
Sample Data
We have a table called City
with the following data:
SELECT
Name AS "City Name",
Population AS "City Population",
District
FROM City
WHERE CountryCode = 'AUS';
Result:
+---------------+-----------------+-----------------+ | City Name | City Population | District | +---------------+-----------------+-----------------+ | Sydney | 3276207 | New South Wales | | Melbourne | 2865329 | Victoria | | Brisbane | 1291117 | Queensland | | Perth | 1096829 | West Australia | | Adelaide | 978100 | South Australia | | Canberra | 322723 | Capital Region | | Gold Coast | 311932 | Queensland | | Newcastle | 270324 | New South Wales | | Central Coast | 227657 | New South Wales | | Wollongong | 219761 | New South Wales | | Hobart | 126118 | Tasmania | | Geelong | 125382 | Victoria | | Townsville | 109914 | Queensland | | Cairns | 92273 | Queensland | +---------------+-----------------+-----------------+ 14 rows in set (0.00 sec)
This table contains city names and their populations, as well as the district that each city belongs to. We can see that some districts contain multiple cities.
One handy modification to this query might be to add a column that calculates the population of each district, based on the total population of all cities within that district. We could use a window function for that.
Calculate Subtotals
Here’s how we can use a window function to calculate subtotals for the above query:
SELECT
Name AS "City Name",
Population AS "City Population",
District,
SUM(Population)
OVER (
PARTITION BY District
ORDER BY Population
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS "District Population"
FROM City
WHERE CountryCode = 'AUS';
Result:
+---------------+-----------------+-----------------+---------------------+ | City Name | City Population | District | District Population | +---------------+-----------------+-----------------+---------------------+ | Canberra | 322723 | Capital Region | 322723 | | Wollongong | 219761 | New South Wales | 3993949 | | Central Coast | 227657 | New South Wales | 3993949 | | Newcastle | 270324 | New South Wales | 3993949 | | Sydney | 3276207 | New South Wales | 3993949 | | Cairns | 92273 | Queensland | 1805236 | | Townsville | 109914 | Queensland | 1805236 | | Gold Coast | 311932 | Queensland | 1805236 | | Brisbane | 1291117 | Queensland | 1805236 | | Adelaide | 978100 | South Australia | 978100 | | Hobart | 126118 | Tasmania | 126118 | | Geelong | 125382 | Victoria | 2990711 | | Melbourne | 2865329 | Victoria | 2990711 | | Perth | 1096829 | West Australia | 1096829 | +---------------+-----------------+-----------------+---------------------+ 14 rows in set (0.00 sec)
We’ve just calculated the total population of each district, based on the cities within those districts.
I call this a subtotal, because the district population is a subtotal of the total population within each country and even for all countries in the table.
We used the PARTITION BY
clause to partition the results into districts. We could calculate the total population of the whole country by removing the PARTITION BY
clause from the OVER()
clause, or even by providing an empty OVER()
clause. And if we didn’t filter the results to just one country, we could calculate the total population of all countries, or multiple countries as the case may be, and also provide subtotals by partitioning by country and district.
An Alternative
Some RDBMSs allow a WITH ROLLUP
modifier to be attached to the GROUP BY
clause clause. This provides us with an alternative way to get totals and subtotals in our result set. See How to Add Totals and Subtotals When Using GROUP BY
in MySQL for examples.