When working with DuckDB, we may need to determine whether a specific value exists within a list or array. Fortunately, DuckDB offers four synonymous functions that enable us to accomplish this.
The Functions
DuckDB provides the following four functions for checking for the existence of a value in a list:
list_contains()list_has()array_contains()array_has()
These are all synonymous, and return true if the given list contains the element.
The syntax goes like this:
list_contains(list, element)
Where element is the value that we’re testing the existence of in the list. The alias functions all accept the same arguments and work the same way.
Example
To start simple, here’s the list_contains() function:
SELECT list_contains(['Cat', 'Dog', 'Bird'], 'Cat');
Result:
true
It returned true because the list does in fact contain Cat, which is the value I was testing for.
Let’s try to find a value that doesn’t exist in the list:
SELECT list_contains(['Cat', 'Dog', 'Bird'], 'Horse');
Result:
false
As expected, it returned false.
Example with All Functions
Here it is again, along with its aliases:
SET VARIABLE animals = ['Cat', 'Dog', 'Bird'];
SELECT
list_contains(getvariable('animals'), 'Cat') AS list_contains,
list_has(getvariable('animals'), 'Cat') AS list_has,
array_contains(getvariable('animals'), 'Cat') AS array_contains,
array_has(getvariable('animals'), 'Cat') AS array_has;
Result:
+---------------+----------+----------------+-----------+
| list_contains | list_has | array_contains | array_has |
+---------------+----------+----------------+-----------+
| true | true | true | true |
+---------------+----------+----------------+-----------+
As expected, they all returned true.
Nested Lists
We can check for a list within a list:
SELECT list_contains([
['Dog', 'Wolf'],
['Cat', 'Bird']
],
['Cat', 'Bird']
);
Output:
true
But we can’t just provide a string and expect it to search through nested lists:
SELECT list_contains([
['Dog', 'Wolf'],
['Cat', 'Bird']
],
'Cat'
);
Output:
Binder Error: list_contains: Cannot match element of type 'VARCHAR' in a list of type 'VARCHAR[][]' - an explicit cast is required
However, there’s nothing to stop us from flattening the list:
SELECT
list_contains(
flatten([
['Dog', 'Wolf'],
['Cat', 'Bird']
]),
'Cat'
);
Output:
true
The flatten() function concatenates a list of lists into a single list. So by using this technique, we’re creating one big list, then searching through it. To illustrate what I mean, here’s the output of the flatten() function by itself:
SELECT flatten([
['Dog', 'Wolf'],
['Cat', 'Bird']
]);
Output:
[Dog, Wolf, Cat, Bird]
So that’s the list that the previous example searched through.
NULL Values
The function works fine if the list contains NULL values:
SELECT list_contains(['Cat', NULL, 'Dog'], 'Cat');
Output:
true
But if our second argument is NULL, then the function returns NULL:
.nullvalue 'null'
SELECT list_contains(['Cat', NULL, 'Dog'], NULL);
Output:
null
This is true, even if the list doesn’t contain any NULL values:
SELECT list_contains(['Cat', 'Dog'], NULL);
Output:
null
In the above example I used .nullvalue 'null' to set my DuckDB CLI to return null whenever there’s a NULL value. The default is an empty string, and so setting it to null helps show that a NULL value was in fact returned.