The list_intersect()
can be quite a handy function when working with lists in DuckDB. It accepts two lists as arguments, and it returns a list of elements that are common to both lists. It essentially performs a set intersection operation while preserving the list data type.
Duplicate values are ignored, so only one value is returned even if there are multiple occurrences in one or both lists.
Syntax
The syntax goes like this:
list_intersect(list1, list2)
The function returns a list of elements that appear in both list1
and list2
(without duplicates).
The function also has an alias, called array_intersect()
. So it can also be used like this:
array_intersect(list1, list2)
Example
Here’s a simple example to demonstrate:
SELECT list_intersect(
['apple', 'banana', 'cherry'],
['banana', 'cherry', 'date']
);
Result:
[cherry, banana]
The reason cherry
and banana
were returned is because they were in both lists.
Duplicate Values
Duplicate values are ignored:
SELECT list_intersect(
['apple', 'banana', 'cherry', 'cherry'],
['banana', 'cherry', 'cherry']
);
Result:
[cherry, banana]
So in this case, cherry
is returned once even though it appears twice in each list.
Only One Match
If there’s only one matching element, it’s still returned in a list:
SELECT list_intersect(
['apple', 'banana', 'cherry'],
['banana', 'kiwi', 'date']
);
Result:
[banana]
No Match
If there’s no matching element, an empty list is returned:
SELECT list_intersect(
['apple', 'banana', 'cherry'],
['persimmon', 'kiwi', 'date']
);
Result:
[]
NULL Values
NULL values aren’t counted as matches:
SELECT list_intersect(
['apple', 'banana', NULL, 'cherry'],
['persimmon', NULL, 'date']
);
Result:
[]
NULL Lists
If both lists are NULL
then NULL
is returned.
SELECT list_intersect( NULL, NULL );
Result:
null
Interestingly, if the first list is NULL
, then NULL
is returned, but if the second list is NULL
, an empty list is returned:
SELECT
list_intersect( NULL, ['banana', 'cherry'] ) AS first_list_null,
list_intersect( ['apple', 'cherry'], NULL ) AS second_list_null;
Result:
+-----------------+------------------+
| first_list_null | second_list_null |
+-----------------+------------------+
| null | [] |
+-----------------+------------------+
Nested Lists
If one of the lists contains a nested list but the other doesn’t, then an error is returned:
SELECT list_intersect(
['apple', 'banana', 'cherry'],
[['banana', 'cherry', 'date']]
);
Result:
Binder Error: list_contains: Cannot match element of type 'VARCHAR' in a list of type 'VARCHAR[][]' - an explicit cast is required
But if both lists contain nested lists, then that’s fine:
SELECT list_intersect(
[['apple', 'banana', 'cherry']],
[['banana', 'cherry', 'date']]
);
Result:
[]
However, in this case, it will be comparing the nested lists with each other. That’s why this example returned an empty list.
Here’s an example of where there’s a match:
SELECT list_intersect(
[['apple', 'banana'], ['cherry', 'kiwi']],
[['banana', 'apple'], ['cherry', 'kiwi']]
);
Result:
[[cherry, kiwi]]
This results in the matching nested list being returned as a nested list. This is in line with the other examples – the list_intersect()
function returns a list. So if the matching elements are lists themselves, then they will be nested lists in the result.
Flattening Nested Lists
It’s also possible to flatten lists that contain other lists:
SELECT
list_intersect(
[['apple', 'banana', 'cherry']],
[['banana', 'cherry', 'date']]
) AS not_flattened,
list_intersect(
flatten([['apple', 'banana', 'cherry']]),
flatten([['banana', 'cherry', 'date']])
) AS flattened;
Result:
+---------------+------------------+
| not_flattened | flattened |
+---------------+------------------+
| [] | [cherry, banana] |
+---------------+------------------+
While we’re at it, let’s flatten the list from the other example:
SELECT
list_intersect(
[['apple', 'banana'], ['cherry', 'kiwi']],
[['banana', 'apple'], ['cherry', 'kiwi']]
) AS not_flattened,
list_intersect(
flatten([['apple', 'banana'], ['cherry', 'kiwi']]),
flatten([['banana', 'apple'], ['cherry', 'kiwi']])
) AS flattened;
Result:
+------------------+-------------------------------+
| not_flattened | flattened |
+------------------+-------------------------------+
| [[cherry, kiwi]] | [kiwi, cherry, banana, apple] |
+------------------+-------------------------------+
An Alias: array_intersect()
DuckDB also has an array_intersect()
function, which is an alias for list_intersect()
. So we can use either one interchangeably:
SELECT
list_intersect(
['apple', 'banana', 'cherry'],
['banana', 'cherry', 'date']
) AS list_intersect,
array_intersect(
['apple', 'banana', 'cherry'],
['banana', 'cherry', 'date']
) AS array_intersect;
Result:
+------------------+------------------+
| list_intersect | array_intersect |
+------------------+------------------+
| [cherry, banana] | [cherry, banana] |
+------------------+------------------+