When using aggregation functions in PostgreSQL, we have the option of including a FILTER
clause. This clause allows us to narrow down – or filter – the rows that are fed to the aggregate function.
This can be be a handy way to provide functionality that we might otherwise need to use a CASE
statement or perhaps a CTE.
In this article, we’ll take a look at PostgreSQL’s FILTER
clause, and see how it can simplify our SQL queries by replacing CASE
statements with more concise code.
What is the FILTER
Clause?
The FILTER
clause is a PostgreSQL extension to the SQL standard that provides a way to perform conditional aggregation. It allows us to specify a condition for each aggregate function, determining which rows should be included in the aggregation.
This feature can significantly simplify our queries and make them more readable compared to using complex CASE
statements or subqueries.
Syntax
The basic syntax of the FILTER
clause is as follows:
aggregate_function(expression) FILTER (WHERE condition)
Here, aggregate_function
is any of PostgreSQL’s aggregate functions (like SUM()
, COUNT()
, AVG()
, etc.), expression
is the column or expression being aggregated, and condition
is the filter we want to apply to the aggregation.
That is a simplified portrayal of the syntax. If you’re interested in the details, the PostgreSQL documentation defines the following syntax:
aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]
aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) [ FILTER ( WHERE filter_clause ) ]
And it can also be used with window functions, where the following syntax applies:
function_name ([expression [, expression ... ]]) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ([expression [, expression ... ]]) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER window_name
function_name ( * ) [ FILTER ( WHERE filter_clause ) ] OVER ( window_definition )
Benefits of Using the FILTER
Clause
As alluded to, there are often multiple ways of achieving the same outcome in SQL, and this is the case with the FILTER
clause. Oftentimes we might be able to get the same results using a CASE
statement or a CTE. However, the FILTER
clause does have its advantages. For example:
- Readability: The
FILTER
clause can make our queries more readable and easier to understand at a glance. - Performance: In many cases, using
FILTER
can be more efficient than alternative methods likeCASE
statements. - Flexibility: We can apply different conditions to multiple aggregate functions in the same query.
Examples
Let’s look at two examples to better understand how the FILTER
clause works in practice.
Example 1: Sales Analysis
Suppose we have a sales
table with columns date
, product
, and amount
. We want to calculate the total sales for each product, but we also want to separate the sales into two categories: sales in the first half of the year and sales in the second half.
Without FILTER
:
SELECT
product,
SUM(CASE WHEN EXTRACT(MONTH FROM date) <= 6 THEN amount ELSE 0 END) AS first_half_sales,
SUM(CASE WHEN EXTRACT(MONTH FROM date) > 6 THEN amount ELSE 0 END) AS second_half_sales
FROM
sales
GROUP BY
product;
With FILTER
:
SELECT
product,
SUM(amount) FILTER (WHERE EXTRACT(MONTH FROM date) <= 6) AS first_half_sales,
SUM(amount) FILTER (WHERE EXTRACT(MONTH FROM date) > 6) AS second_half_sales
FROM
sales
GROUP BY
product;
As we can see, the FILTER
clause makes our query more concise and easier to read.
Example 2: Student Grade Analysis
Let’s consider another scenario where we have a grades
table with columns student_id
, subject
, and grade
. We want to calculate the average grade for each student, but we also want to see the average grade for science subjects and non-science subjects separately.
SELECT
student_id,
AVG(grade) AS overall_average,
AVG(grade) FILTER (WHERE subject IN ('Physics', 'Chemistry', 'Biology')) AS science_average,
AVG(grade) FILTER (WHERE subject NOT IN ('Physics', 'Chemistry', 'Biology')) AS non_science_average
FROM
grades
GROUP BY
student_id;
In this example, we use the FILTER
clause to calculate different averages based on the subject, all in a single, easy-to-read query.
Limitations and Considerations
While the FILTER
clause is powerful, it’s important to note a few limitations:
- The
FILTER
clause is a PostgreSQL-specific feature, so it may not be available in other database systems. - It can only be used with aggregate functions.
- The condition in the
FILTER
clause can only reference columns from the same table as the aggregate function.
More
As mentioned, we can include the FILTER
clause with window functions. See Understanding Window Functions in SQL for an overview of window functions.
And here’s how to create a window function in SQL.
And for more about CASE
statements/expressions, see PostgreSQL CASE
Expression and SQL CASE
Statement.