A Quick Look at DuckDB’s MAX() Function

Most RDBMSs have a max() function and DuckDB is no exception. The max() function is a fundamental aggregate function in SQL that returns the maximum value from a set of values.

This article looks at how the max() function works in DuckDB, along with some simple examples to demonstrate.

Syntax

The max() function can be used in the following two ways:

max(arg)
max(arg, n)

Where arg is name of the column from which you want to find the maximum value.

The n argument is optional. If we provide that, it specifies how many results to return (i.e. the top n maximum values).

Example

Suppose we create and populate the following table:

-- Create the table
CREATE TABLE sales_data (
    id INTEGER,
    amount DECIMAL(10, 2),
    order_date DATE,
    region TEXT
);

-- Insert sample data
INSERT INTO sales_data (id, amount, order_date, region) VALUES
(1, 100.00, '2023-01-01', 'North'),
(2, 200.00, '2023-02-15', 'South'),
(3, 150.00, '2023-03-10', 'North'),
(4, 300.00, '2023-04-22', 'East'),
(5, 250.00, '2023-05-05', 'South'),
(6, 400.00, '2023-06-18', 'East'),
(7, 350.00, '2023-07-30', 'North'),
(8, 500.00, '2023-08-12', 'West'),
(9, 600.00, '2023-09-25', 'West'),
(10, 450.00, '2023-10-10', 'South');

-- Return all data
SELECT * FROM sales_data;

Output:

+----+--------+------------+--------+
| id | amount | order_date | region |
+----+--------+------------+--------+
| 1 | 100.00 | 2023-01-01 | North |
| 2 | 200.00 | 2023-02-15 | South |
| 3 | 150.00 | 2023-03-10 | North |
| 4 | 300.00 | 2023-04-22 | East |
| 5 | 250.00 | 2023-05-05 | South |
| 6 | 400.00 | 2023-06-18 | East |
| 7 | 350.00 | 2023-07-30 | North |
| 8 | 500.00 | 2023-08-12 | West |
| 9 | 600.00 | 2023-09-25 | West |
| 10 | 450.00 | 2023-10-10 | South |
+----+--------+------------+--------+

To find the maximum sales amount, we can use the following query:

SELECT 
    max(amount) AS max_sales
FROM sales_data;

Result:

+-----------+
| max_sales |
+-----------+
| 600.00 |
+-----------+

Return the Top n Maximum Values

We can provide a second argument to specify how many values to return:

SELECT 
    max(amount, 3) AS top_3_sales
FROM sales_data;

Result:

+--------------------------+
| top_3_sales |
+--------------------------+
| [600.00, 500.00, 450.00] |
+--------------------------+

When we do this, the result is returned as a list containing the specified number of maximum values. In this example I specified that the top three maximum values be returned.

Using max() with a WHERE Clause

We can also use the max() function with a WHERE clause to filter the data before finding the maximum value. For example, if we want to find the maximum sales amount for a specific region, we can use:

SELECT 
    max(amount) AS max_sales
FROM sales_data
WHERE region = 'North';

Result:

+-----------+
| max_sales |
+-----------+
| 350.00 |
+-----------+

Finding the Maximum Value in a Date Column

Here’s an example of max() being applied to a date column:

SELECT 
    max(order_date) AS latest_order
FROM sales_data;

Result:

+--------------+
| latest_order |
+--------------+
| 2023-10-10 |
+--------------+

Here are the top 2 dates:

SELECT 
    max(order_date, 2) AS latest_order
FROM sales_data;

Result:

+--------------------------+
| latest_order |
+--------------------------+
| [2023-10-10, 2023-09-25] |
+--------------------------+

Using max() with GROUP BY

The max() function can be combined with the GROUP BY clause to find the maximum value for each group in a table. For example, if we want to find the maximum sales amount for each region in the sales table, we can use:

SELECT 
    region, 
    max(amount) AS max_sales
FROM sales_data
GROUP BY region;

Result:

+--------+-----------+
| region | max_sales |
+--------+-----------+
| North | 350.00 |
| South | 450.00 |
| East | 400.00 |
| West | 600.00 |
+--------+-----------+

This query returns the highest sales amount for each region.

Using max() with Window Functions

DuckDB also supports window functions, which allow us to perform calculations across a set of table rows that are somehow related to the current row.

We can use the max() function as a window function to find the maximum value within a specific window of rows. For example:

SELECT 
    id, 
    amount, 
    region, 
    max(amount) OVER (PARTITION BY region) AS max_sales_in_region
FROM sales_data
ORDER BY region;

Result:

+----+--------+--------+---------------------+
| id | amount | region | max_sales_in_region |
+----+--------+--------+---------------------+
| 4 | 300.00 | East | 400.00 |
| 6 | 400.00 | East | 400.00 |
| 1 | 100.00 | North | 350.00 |
| 3 | 150.00 | North | 350.00 |
| 7 | 350.00 | North | 350.00 |
| 2 | 200.00 | South | 450.00 |
| 5 | 250.00 | South | 450.00 |
| 10 | 450.00 | South | 450.00 |
| 8 | 500.00 | West | 600.00 |
| 9 | 600.00 | West | 600.00 |
+----+--------+--------+---------------------+

This query returns the maximum sales amount for each region, alongside each individual sale.