When working with DuckDB, we may need to determine whether a specific value exists within a list or array. Fortunately, DuckDB offers four synonymous functions that enable us to accomplish this.
The Functions
DuckDB provides the following four functions for checking for the existence of a value in a list:
list_contains()
list_has()
array_contains()
array_has()
These are all synonymous, and return true if the given list contains the element.
The syntax goes like this:
list_contains(list, element)
Where element
is the value that we’re testing the existence of in the list
. The alias functions all accept the same arguments and work the same way.
Example
To start simple, here’s the list_contains()
function:
SELECT list_contains(['Cat', 'Dog', 'Bird'], 'Cat');
Result:
true
It returned true because the list does in fact contain Cat
, which is the value I was testing for.
Let’s try to find a value that doesn’t exist in the list:
SELECT list_contains(['Cat', 'Dog', 'Bird'], 'Horse');
Result:
false
As expected, it returned false
.
Example with All Functions
Here it is again, along with its aliases:
SET VARIABLE animals = ['Cat', 'Dog', 'Bird'];
SELECT
list_contains(getvariable('animals'), 'Cat') AS list_contains,
list_has(getvariable('animals'), 'Cat') AS list_has,
array_contains(getvariable('animals'), 'Cat') AS array_contains,
array_has(getvariable('animals'), 'Cat') AS array_has;
Result:
+---------------+----------+----------------+-----------+
| list_contains | list_has | array_contains | array_has |
+---------------+----------+----------------+-----------+
| true | true | true | true |
+---------------+----------+----------------+-----------+
As expected, they all returned true
.
Nested Lists
We can check for a list within a list:
SELECT list_contains([
['Dog', 'Wolf'],
['Cat', 'Bird']
],
['Cat', 'Bird']
);
Output:
true
But we can’t just provide a string and expect it to search through nested lists:
SELECT list_contains([
['Dog', 'Wolf'],
['Cat', 'Bird']
],
'Cat'
);
Output:
Binder Error: list_contains: Cannot match element of type 'VARCHAR' in a list of type 'VARCHAR[][]' - an explicit cast is required
However, there’s nothing to stop us from flattening the list:
SELECT
list_contains(
flatten([
['Dog', 'Wolf'],
['Cat', 'Bird']
]),
'Cat'
);
Output:
true
The flatten()
function concatenates a list of lists into a single list. So by using this technique, we’re creating one big list, then searching through it. To illustrate what I mean, here’s the output of the flatten()
function by itself:
SELECT flatten([
['Dog', 'Wolf'],
['Cat', 'Bird']
]);
Output:
[Dog, Wolf, Cat, Bird]
So that’s the list that the previous example searched through.
NULL Values
The function works fine if the list contains NULL values:
SELECT list_contains(['Cat', NULL, 'Dog'], 'Cat');
Output:
true
But if our second argument is NULL
, then the function returns NULL
:
.nullvalue 'null'
SELECT list_contains(['Cat', NULL, 'Dog'], NULL);
Output:
null
This is true, even if the list doesn’t contain any NULL values:
SELECT list_contains(['Cat', 'Dog'], NULL);
Output:
null
In the above example I used .nullvalue 'null'
to set my DuckDB CLI to return null whenever there’s a NULL value. The default is an empty string, and so setting it to null
helps show that a NULL value was in fact returned.