We can use SQL to create a running total of a given column. By this I mean, we can create a column that calculates the cumulative sum of a given column.
The running total/cumulative sum is the sum of all rows up until the current row. The result is that the sum increases (or decreases in the case of negative values) with each row returned in the result set. Each row’s value is added to the cumulative amount from all prior rows, so for any given row, we get the total sum for all rows up to that point – the “running total”.
Example
We can use a window function to compute the running total.
In this example we’ll use the world
sample database, which contains data such as countries, districts, cities, and city population.
These examples were tested in MySQL, but they should work in most major RDBMSs. That said, you should check your RDBMS’s documentation for window functions in case it uses a different syntax.
Sample Data
We have a table called City
with the following data:
SELECT * FROM City
WHERE CountryCode = 'AUS';
Result:
+-----+---------------+-------------+-----------------+------------+ | ID | Name | CountryCode | District | Population | +-----+---------------+-------------+-----------------+------------+ | 130 | Sydney | AUS | New South Wales | 3276207 | | 131 | Melbourne | AUS | Victoria | 2865329 | | 132 | Brisbane | AUS | Queensland | 1291117 | | 133 | Perth | AUS | West Australia | 1096829 | | 134 | Adelaide | AUS | South Australia | 978100 | | 135 | Canberra | AUS | Capital Region | 322723 | | 136 | Gold Coast | AUS | Queensland | 311932 | | 137 | Newcastle | AUS | New South Wales | 270324 | | 138 | Central Coast | AUS | New South Wales | 227657 | | 139 | Wollongong | AUS | New South Wales | 219761 | | 140 | Hobart | AUS | Tasmania | 126118 | | 141 | Geelong | AUS | Victoria | 125382 | | 142 | Townsville | AUS | Queensland | 109914 | | 143 | Cairns | AUS | Queensland | 92273 | +-----+---------------+-------------+-----------------+------------+ 14 rows in set (0.00 sec)
This table contains city names and their populations, as well as the district and country (country code) that each city belongs to.
I should point out that these populations don’t reflect the actual populations in those cities. Nor does it represent all cities in the respective districts. This is just a sample database with sample data.
Calculate a Running Total
Here’s an example that calculates a running total for the population column:
SELECT
District,
Name,
Population,
SUM(Population)
OVER (
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS "Running Total"
FROM City
WHERE CountryCode = 'AUS'
ORDER BY District;
Result:
+-----------------+---------------+------------+---------------+ | District | Name | Population | Running Total | +-----------------+---------------+------------+---------------+ | Capital Region | Canberra | 322723 | 322723 | | New South Wales | Sydney | 3276207 | 3598930 | | New South Wales | Newcastle | 270324 | 3869254 | | New South Wales | Central Coast | 227657 | 4096911 | | New South Wales | Wollongong | 219761 | 4316672 | | Queensland | Brisbane | 1291117 | 5607789 | | Queensland | Gold Coast | 311932 | 5919721 | | Queensland | Townsville | 109914 | 6029635 | | Queensland | Cairns | 92273 | 6121908 | | South Australia | Adelaide | 978100 | 7100008 | | Tasmania | Hobart | 126118 | 7226126 | | Victoria | Melbourne | 2865329 | 10091455 | | Victoria | Geelong | 125382 | 10216837 | | West Australia | Perth | 1096829 | 11313666 | +-----------------+---------------+------------+---------------+ 14 rows in set (0.00 sec)
Here, each city’s population is returned in the Population
column, and the running total is in the Running Total
column.
We can see that each row in the Running Total
column increments by the same amount that’s in the Population
column. For example, if we add the amounts in the first two rows (322723
+ 3276207
) we get 3598930
. And if we add the third row (270324
) to that, we get 3869254
, and so on.
To achieve this result, I used a window function to compute the running total. I used the SUM()
function with an OVER
clause. The bit that goes ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
is a frame clause that specifies the rows to use when calculating the running total.
- The
UNBOUNDED PRECEDING
bit means the first partition row. Given we haven’t specified any partitions, this is the first row of the result set. - The
CURRENT ROW
bit means the current row. This is because we specifiedROWS
. If we had specifiedRANGE
, thenCURRENT ROW
would set the bound to the peers of the current row.
We can also shorten the frame clause to just ROWS UNBOUNDED PRECEDING
to achieve the same result:
SELECT
District,
Name,
Population,
SUM(Population)
OVER (
ROWS UNBOUNDED PRECEDING
) AS "Running Total"
FROM City
WHERE CountryCode = 'AUS'
ORDER BY District;
Result:
+-----------------+---------------+------------+---------------+ | District | Name | Population | Running Total | +-----------------+---------------+------------+---------------+ | Capital Region | Canberra | 322723 | 322723 | | New South Wales | Sydney | 3276207 | 3598930 | | New South Wales | Newcastle | 270324 | 3869254 | | New South Wales | Central Coast | 227657 | 4096911 | | New South Wales | Wollongong | 219761 | 4316672 | | Queensland | Brisbane | 1291117 | 5607789 | | Queensland | Gold Coast | 311932 | 5919721 | | Queensland | Townsville | 109914 | 6029635 | | Queensland | Cairns | 92273 | 6121908 | | South Australia | Adelaide | 978100 | 7100008 | | Tasmania | Hobart | 126118 | 7226126 | | Victoria | Melbourne | 2865329 | 10091455 | | Victoria | Geelong | 125382 | 10216837 | | West Australia | Perth | 1096829 | 11313666 | +-----------------+---------------+------------+---------------+ 14 rows in set (0.00 sec)
Running Total within Each Category
We can modify our query to get the running total within each category. In our case, the District
column can be our “category”:
SELECT
District,
Name,
Population AS "City Pop",
SUM(Population)
OVER (
PARTITION BY District
ORDER BY Population
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS "District Running Total",
SUM(Population)
OVER (
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS "Country Running Total"
FROM City
WHERE CountryCode = 'AUS'
ORDER BY District;
Result:
+-----------------+---------------+----------+------------------------+-----------------------+ | District | Name | City Pop | District Running Total | Country Running Total | +-----------------+---------------+----------+------------------------+-----------------------+ | Capital Region | Canberra | 322723 | 322723 | 322723 | | New South Wales | Wollongong | 219761 | 219761 | 542484 | | New South Wales | Central Coast | 227657 | 447418 | 770141 | | New South Wales | Newcastle | 270324 | 717742 | 1040465 | | New South Wales | Sydney | 3276207 | 3993949 | 4316672 | | Queensland | Cairns | 92273 | 92273 | 4408945 | | Queensland | Townsville | 109914 | 202187 | 4518859 | | Queensland | Gold Coast | 311932 | 514119 | 4830791 | | Queensland | Brisbane | 1291117 | 1805236 | 6121908 | | South Australia | Adelaide | 978100 | 978100 | 7100008 | | Tasmania | Hobart | 126118 | 126118 | 7226126 | | Victoria | Geelong | 125382 | 125382 | 7351508 | | Victoria | Melbourne | 2865329 | 2990711 | 10216837 | | West Australia | Perth | 1096829 | 1096829 | 11313666 | +-----------------+---------------+----------+------------------------+-----------------------+ 14 rows in set (0.00 sec)
This time we have two running totals:
- The
District Running Total
column calculates the running total within each district. - The
Country Running Total
column is the same as the previous example. It simply calculates the running total of the whole result set.
To achieve the values in the District Running Total
column, we partitioned the results by the District
column (and we also ordered it by the Population
column).
Changing the Order within Each Partition
When we calculated the running total within each district, our OVER
clause contained an ORDER BY
clause. This allowed us specify which order to use when calculating the cumulative sum within each partition.
We can change this order as required.
Example:
SELECT
District,
Name,
Population AS "City Pop",
SUM(Population)
OVER (
PARTITION BY District
ORDER BY Population DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS "District Running Total",
SUM(Population)
OVER (
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS "Country Running Total"
FROM City
WHERE CountryCode = 'AUS'
ORDER BY District;
Result:
+-----------------+---------------+----------+------------------------+-----------------------+ | District | Name | City Pop | District Running Total | Country Running Total | +-----------------+---------------+----------+------------------------+-----------------------+ | Capital Region | Canberra | 322723 | 322723 | 322723 | | New South Wales | Sydney | 3276207 | 3276207 | 3598930 | | New South Wales | Newcastle | 270324 | 3546531 | 3869254 | | New South Wales | Central Coast | 227657 | 3774188 | 4096911 | | New South Wales | Wollongong | 219761 | 3993949 | 4316672 | | Queensland | Brisbane | 1291117 | 1291117 | 5607789 | | Queensland | Gold Coast | 311932 | 1603049 | 5919721 | | Queensland | Townsville | 109914 | 1712963 | 6029635 | | Queensland | Cairns | 92273 | 1805236 | 6121908 | | South Australia | Adelaide | 978100 | 978100 | 7100008 | | Tasmania | Hobart | 126118 | 126118 | 7226126 | | Victoria | Melbourne | 2865329 | 2865329 | 10091455 | | Victoria | Geelong | 125382 | 2990711 | 10216837 | | West Australia | Perth | 1096829 | 1096829 | 11313666 | +-----------------+---------------+----------+------------------------+-----------------------+ 14 rows in set (0.00 sec)
Here I changed it so that the district’s running total is ordered by population in descending order. To do this, I used ORDER BY Population DESC
in the OVER
clause. The previous query omitted the DESC
part, and so defaulted to ASC
for ascending order.