DuckDB has a flatten()
function that we can use to flatten nested lists. The function concatenates a list of lists into a single list. So whether the outer list contains just one list or multiple lists, we can use the flatten()
function to flatten them into one list.
However, it only goes one level deep, so that’s something to keep in mind.
Example
Here’s a basic example to illustrate how it works:
SELECT flatten([
['Dog', 'Cat'],
['Tree', 'Flower'],
['Red', 'Blue']
]);
Result:
[Dog, Cat, Tree, Flower, Red, Blue]
So it basically combined everything into one big list.
Single Lists
If the outer list only contains one list, then it will have the effect of returning just the inner list:
SELECT flatten([
['Dog', 'Cat']
]);
Result:
[Dog, Cat]
Kinda like extracting the inner list from the outer list.
Multiple Levels
As alluded to, the flatten()
function only flattens one level deep. If a list contains more than one level of lists, then only the first level is flattened:
SELECT flatten([
[
['Dog', 'Cat'],
['Tree', 'Flower']
],
[
['Red', 'Blue'],
['Yellow', 'Green']
]
]);
Result:
[[Dog, Cat], [Tree, Flower], [Red, Blue], [Yellow, Green]]
When the List is Already Flat
Passing a single level list results in an error:
SELECT flatten(
['Dog', 'Cat']
);
Result:
Binder Error: No function matches the given name and argument types 'flatten(VARCHAR[])'. You might need to add explicit type casts.
Candidate functions:
flatten(ANY[][]) -> ANY[]
LINE 1: SELECT flatten(
^
Passing Empty Lists
Passing an empty list returns an empty list:
SELECT flatten([]);
Result:
[]
Passing NULL Values
Passing NULL
results in NULL
:
.nullvalues 'null'
SELECT flatten(NULL);
Result:
null
And if we pass a list that only contains NULL
, then an empty list is returned:
SELECT flatten([NULL]);
Result:
[]
However, if there’s an inner list with nothing but NULL, then that will be returned:
SELECT flatten([[NULL]]);
Result:
[NULL]
If there are multiple inner lists that contain nothing but NULL values, then those NULL values are included in the final list:
SELECT flatten([[NULL],[NULL],[NULL,NULL]]);
Result:
[NULL, NULL, NULL, NULL]
Deduping the Result
The flatten()
function doesn’t remove duplicates. So if multiple sub-lists contain the same value, then that value is included multiple times in the resulting list. However, if this isn’t desirable, you can use list_distinct()
to remove duplicates from the result:
SELECT list_distinct(
flatten([
['Dog', 'Cat'],
['Cat', 'Bird']
])
);
Result:
[Cat, Bird, Dog]
One thing to bear in mind is that list_distinct()
doesn’t preserve the original order.
There’s also an array_distinct()
function, which is an alias for list_distinct()
. In other words, you can use either function to get the same result.