Get the Most Populous City in Each District When Using a Window Function in SQL

While it’s true that we can use the SQL MAX() function to get the maximum value in a column, what if we want to return the value from another column instead of the actual maximum value itself? And what if we want it partitioned by category, so that it’s based on the maximum value from each category? And what if we want all values listed out, including those that aren’t the most populous?

In the following example, we use a window function to solve this problem.

Example

Here’s an example of a query that does what we want:

SELECT
    District,
    Name,
    Population,
LAST_VALUE(Name) OVER (
    PARTITION BY District
    ORDER BY Population
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS "Most Populous City"
FROM City
WHERE CountryCode = 'AUS';

Result:

+-----------------+---------------+------------+--------------------+
| District        | Name          | Population | Most Populous City |
+-----------------+---------------+------------+--------------------+
| Capital Region  | Canberra      |     322723 | Canberra           |
| New South Wales | Wollongong    |     219761 | Sydney             |
| New South Wales | Central Coast |     227657 | Sydney             |
| New South Wales | Newcastle     |     270324 | Sydney             |
| New South Wales | Sydney        |    3276207 | Sydney             |
| Queensland      | Cairns        |      92273 | Brisbane           |
| Queensland      | Townsville    |     109914 | Brisbane           |
| Queensland      | Gold Coast    |     311932 | Brisbane           |
| Queensland      | Brisbane      |    1291117 | Brisbane           |
| South Australia | Adelaide      |     978100 | Adelaide           |
| Tasmania        | Hobart        |     126118 | Hobart             |
| Victoria        | Geelong       |     125382 | Melbourne          |
| Victoria        | Melbourne     |    2865329 | Melbourne          |
| West Australia  | Perth         |    1096829 | Perth              |
+-----------------+---------------+------------+--------------------+

This query returns all cities, as well as listing out the most populous city within each district. Given we’re returning all cities, the most populous city is repeated in those cases where there’s more than one city within a district.

By “most populous city”, I’m referring to the cities that have the largest population in their respective districts.

In the above example, the Most Populous City column contains this data. For example, we can see Sydney is listed as the most populous city for New South Wales. We can verify this by looking at the Population column. It does indeed have the highest population of all cities in that district. The same is true for Brisbane and Melbourne. For those districts with only one city, that city is listed as the most populous for its district.

Least Populous City

We can also flip it on its head and return the least populous city within a district:

SELECT
    District,
    Name,
    Population,
LAST_VALUE(Name) OVER (
    PARTITION BY District
    ORDER BY Population DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS "Least Populous City"
FROM City
WHERE CountryCode = 'AUS';

Result:

+-----------------+---------------+------------+---------------------+
| District        | Name          | Population | Least Populous City |
+-----------------+---------------+------------+---------------------+
| Capital Region  | Canberra      |     322723 | Canberra            |
| New South Wales | Sydney        |    3276207 | Wollongong          |
| New South Wales | Newcastle     |     270324 | Wollongong          |
| New South Wales | Central Coast |     227657 | Wollongong          |
| New South Wales | Wollongong    |     219761 | Wollongong          |
| Queensland      | Brisbane      |    1291117 | Cairns              |
| Queensland      | Gold Coast    |     311932 | Cairns              |
| Queensland      | Townsville    |     109914 | Cairns              |
| Queensland      | Cairns        |      92273 | Cairns              |
| South Australia | Adelaide      |     978100 | Adelaide            |
| Tasmania        | Hobart        |     126118 | Hobart              |
| Victoria        | Melbourne     |    2865329 | Geelong             |
| Victoria        | Geelong       |     125382 | Geelong             |
| West Australia  | Perth         |    1096829 | Perth               |
+-----------------+---------------+------------+---------------------+

All I did was change it to descending order. Specifically, I changed ORDER BY Population to ORDER BY Population DESC.

Although these examples use the population of cities, they could be modified for other use cases. For example, we could use a similar query to return the most popular products sold in each category, the oldest individuals within each group, the companies with the largest market cap within each industry, etc.