Create a Window Function in SQL

Most of the major RDBMSs allow us to create window functions in SQL queries.

A window function is a special aggregate-like function that enables partitioning and ordering of data within a result set. Unlike regular aggregate functions, which collapse multiple rows into a single result, window functions retain individual rows in the output, augmenting them with additional computed values.

The following example demonstrates how to create a window function in SQL.

Example

Suppose we have a table with the following data:

SELECT 
    Name,
    District, 
    Population
FROM City 
WHERE CountryCode = 'AUS';

Result:

+---------------+-----------------+------------+
| Name          | District        | Population |
+---------------+-----------------+------------+
| Sydney        | New South Wales |    3276207 |
| Melbourne     | Victoria        |    2865329 |
| Brisbane      | Queensland      |    1291117 |
| Perth         | West Australia  |    1096829 |
| Adelaide      | South Australia |     978100 |
| Canberra      | Capital Region  |     322723 |
| Gold Coast    | Queensland      |     311932 |
| Newcastle     | New South Wales |     270324 |
| Central Coast | New South Wales |     227657 |
| Wollongong    | New South Wales |     219761 |
| Hobart        | Tasmania        |     126118 |
| Geelong       | Victoria        |     125382 |
| Townsville    | Queensland      |     109914 |
| Cairns        | Queensland      |      92273 |
+---------------+-----------------+------------+
14 rows in set (0.00 sec)

This table contains a list of cities around the world. In this example I narrowed the results to just Australian cities (i.e. those with a CountryCode of AUS).

Let’s modify the above query to use a window function:

SELECT 
    District, 
    Name, 
    Population AS "City Population", 
    SUM(Population) OVER(PARTITION BY District) AS "District Population" 
FROM City 
WHERE CountryCode = 'AUS' 
ORDER BY District, Name, "City Population";

Result:

+-----------------+---------------+-----------------+---------------------+
| District        | Name          | City Population | District Population |
+-----------------+---------------+-----------------+---------------------+
| Capital Region  | Canberra      |          322723 |              322723 |
| New South Wales | Central Coast |          227657 |             3993949 |
| New South Wales | Newcastle     |          270324 |             3993949 |
| New South Wales | Sydney        |         3276207 |             3993949 |
| New South Wales | Wollongong    |          219761 |             3993949 |
| Queensland      | Brisbane      |         1291117 |             1805236 |
| Queensland      | Cairns        |           92273 |             1805236 |
| Queensland      | Gold Coast    |          311932 |             1805236 |
| Queensland      | Townsville    |          109914 |             1805236 |
| South Australia | Adelaide      |          978100 |              978100 |
| Tasmania        | Hobart        |          126118 |              126118 |
| Victoria        | Geelong       |          125382 |             2990711 |
| Victoria        | Melbourne     |         2865329 |             2990711 |
| West Australia  | Perth         |         1096829 |             1096829 |
+-----------------+---------------+-----------------+---------------------+
14 rows in set (0.01 sec)

We create window functions with the OVER clause. Here, I applied the OVER clause to the SQL SUM() function.

Here, we used PARTITION BY District in order to partition the result set by the District column. This enabled us to get the total population of each district, based on the sum of the cities within each respective district. The same value is replicated across each row in each district. This is to be expected given we’re outputting all cities within each district.

The PARTITION BY clause is one of several optional arguments we can provide with the OVER clause. We can also provide an empty OVER clause (i.e. an OVER clause without any arguments).

Window Function Syntax

The basic syntax for creating window functions goes something like this:

OVER ( window_spec | window_name )

Where:

  • window_spec is the actual specification for the window function
  • window_name is the name of a window that’s defined in a WINDOW clause elsewhere in the query

Which basically means that we have the option of providing the actual specification for the window function within the OVER clause (like we did in the above example), or providing the name of a named window that contains the specification.

Here’s the basic syntax for providing the specification within the actual OVER clause:

OVER (   
       [ <PARTITION BY clause> ]  
       [ <ORDER BY clause> ]   
       [ <FRAME clause> ]  
      )  

So it consists of an OVER clause, followed by parentheses, which can contain other optional clauses.

Here’s a basic explanation of the above sub-clauses of the OVER clause:

  • <PARTITION BY clause> – Divides the result set into partitions.
  • <ORDER BY clause> – Defines the logical order of the rows within each partition of the result set.
  • <FRAME clause> – Creates frames within the partitions. A frame is a subset of the current partition, and the frame clause specifies how this subset is defined. The frame clause is typically defined in the form of a ROWS, RANGE, or GROUP clause (or similar, depending on the DBMS).

The exact syntax may depend on your DBMS, so be sure to check with the DBMS’s documentation.

Window functions can usually be placed inside the SELECT list, or in the ORDER BY clause.

More Examples

See the following articles for more examples of window functions: