Most of the major RDBMSs allow us to create window functions in SQL queries.
A window function is a special aggregate-like function that enables partitioning and ordering of data within a result set. Unlike regular aggregate functions, which collapse multiple rows into a single result, window functions retain individual rows in the output, augmenting them with additional computed values.
The following example demonstrates how to create a window function in SQL.
Example
Suppose we have a table with the following data:
SELECT
Name,
District,
Population
FROM City
WHERE CountryCode = 'AUS';
Result:
+---------------+-----------------+------------+ | Name | District | Population | +---------------+-----------------+------------+ | Sydney | New South Wales | 3276207 | | Melbourne | Victoria | 2865329 | | Brisbane | Queensland | 1291117 | | Perth | West Australia | 1096829 | | Adelaide | South Australia | 978100 | | Canberra | Capital Region | 322723 | | Gold Coast | Queensland | 311932 | | Newcastle | New South Wales | 270324 | | Central Coast | New South Wales | 227657 | | Wollongong | New South Wales | 219761 | | Hobart | Tasmania | 126118 | | Geelong | Victoria | 125382 | | Townsville | Queensland | 109914 | | Cairns | Queensland | 92273 | +---------------+-----------------+------------+ 14 rows in set (0.00 sec)
This table contains a list of cities around the world. In this example I narrowed the results to just Australian cities (i.e. those with a CountryCode
of AUS
).
Let’s modify the above query to use a window function:
SELECT
District,
Name,
Population AS "City Population",
SUM(Population) OVER(PARTITION BY District) AS "District Population"
FROM City
WHERE CountryCode = 'AUS'
ORDER BY District, Name, "City Population";
Result:
+-----------------+---------------+-----------------+---------------------+ | District | Name | City Population | District Population | +-----------------+---------------+-----------------+---------------------+ | Capital Region | Canberra | 322723 | 322723 | | New South Wales | Central Coast | 227657 | 3993949 | | New South Wales | Newcastle | 270324 | 3993949 | | New South Wales | Sydney | 3276207 | 3993949 | | New South Wales | Wollongong | 219761 | 3993949 | | Queensland | Brisbane | 1291117 | 1805236 | | Queensland | Cairns | 92273 | 1805236 | | Queensland | Gold Coast | 311932 | 1805236 | | Queensland | Townsville | 109914 | 1805236 | | South Australia | Adelaide | 978100 | 978100 | | Tasmania | Hobart | 126118 | 126118 | | Victoria | Geelong | 125382 | 2990711 | | Victoria | Melbourne | 2865329 | 2990711 | | West Australia | Perth | 1096829 | 1096829 | +-----------------+---------------+-----------------+---------------------+ 14 rows in set (0.01 sec)
We create window functions with the OVER
clause. Here, I applied the OVER
clause to the SQL SUM()
function.
Here, we used PARTITION BY District
in order to partition the result set by the District
column. This enabled us to get the total population of each district, based on the sum of the cities within each respective district. The same value is replicated across each row in each district. This is to be expected given we’re outputting all cities within each district.
The PARTITION BY
clause is one of several optional arguments we can provide with the OVER
clause. We can also provide an empty OVER
clause (i.e. an OVER
clause without any arguments).
Window Function Syntax
The basic syntax for creating window functions goes something like this:
OVER ( window_spec | window_name )
Where:
window_spec
is the actual specification for the window functionwindow_name
is the name of a window that’s defined in aWINDOW
clause elsewhere in the query
Which basically means that we have the option of providing the actual specification for the window function within the OVER
clause (like we did in the above example), or providing the name of a named window that contains the specification.
Here’s the basic syntax for providing the specification within the actual OVER
clause:
OVER (
[ <PARTITION BY clause> ]
[ <ORDER BY clause> ]
[ <FRAME clause> ]
)
So it consists of an OVER
clause, followed by parentheses, which can contain other optional clauses.
Here’s a basic explanation of the above sub-clauses of the OVER
clause:
<PARTITION BY clause>
– Divides the result set into partitions.<ORDER BY clause>
– Defines the logical order of the rows within each partition of the result set.<FRAME clause>
– Creates frames within the partitions. A frame is a subset of the current partition, and the frame clause specifies how this subset is defined. The frame clause is typically defined in the form of aROWS
,RANGE
, orGROUP
clause (or similar, depending on the DBMS).
The exact syntax may depend on your DBMS, so be sure to check with the DBMS’s documentation.
Window functions can usually be placed inside the SELECT
list, or in the ORDER BY
clause.
More Examples
See the following articles for more examples of window functions: