3 Easy Ways to Calculate the Median Value from a List in DuckDB

While we might typically use DuckDB’s median() function to get the median value from a column, when it comes to getting the median value from a list, we need to modify this slightly. A list can contain many values, and so we need a way to calculate the median of all of those values, rather than a single value like we’d normally be doing when calculating the median value from a column.

Fortunately, DuckDB provides us with several easy options for achieving this. Below are three options for getting the median value from a list in DuckDB.

Continue reading

Using LIST_INTERSECT() to Extract Overlapping Elements from Two Lists in DuckDB

The list_intersect() can be quite a handy function when working with lists in DuckDB. It accepts two lists as arguments, and it returns a list of elements that are common to both lists. It essentially performs a set intersection operation while preserving the list data type.

Duplicate values are ignored, so only one value is returned even if there are multiple occurrences in one or both lists.

Continue reading

An Important Consideration When Choosing Between LIST_CONCAT() and || For Concatenating Lists in DuckDB

When it comes to concatenating lists in DuckDB, we can use the list_concat() function (and its aliases), or the list concatenation operator (||). Often when we have a choice like this, we can use either option interchangeably. But that isn’t quite true in this case.

There’s an important difference between list_concat() and the || operator in DuckDB, and it all comes down to how it handles NULL values.

Continue reading

A Quick Look at the LIST_AGGREGATE() Function in DuckDB

DuckDB’s list_aggregate() function is a handy tool for performing grouped aggregations over lists. It allows us to apply any aggregate function (like sum, avg, min, max, count, etc.) to a list column as if each list item were a row.

The way it works is that we pass the list as the first argument, followed by the name of an aggregate function we want to apply to that list. The function will return its result as if it were the named aggregate function.

Continue reading