4 Options for Extracting the First Value from a List in DuckDB

When working with lists in DuckDB, sometimes you might want to return just the first value, rather than the whole list. Fortunately, DuckDB provides us with several ways to do this.

1. The list_first() Function

The list_first() function does what its name suggests it might do; it returns the first value from a given list. To use this function, we simply pass the list. No other arguments are required or accepted.

Example:

SELECT list_first(['Cat', 'Dog', 'Bird']);

Result:

Cat

If the first value is a list, then that nested list is returned:

SELECT list_first([['Cat', 'Dog'], ['Bird', 'Frog']]);

Result:

[Cat, Dog]

We can also use the unnest() function to unnest the list. When we do this, we might get a different result, depending on where we put the unnest() function.

For example:

SELECT list_first([
    unnest(['Cat', 'Dog']), 
    unnest(['Bird', 'Frog'])
]) AS unnested;

Output:

+----------+
| unnested |
+----------+
| Cat |
| Dog |
+----------+

Here’s what happens when we apply unnest() to the outer list:

SELECT list_first(
    unnest([
        ['Cat', 'Dog'], 
        ['Bird', 'Frog']
        ])
) AS unnested;

Output:

+----------+
| unnested |
+----------+
| Cat |
| Bird |
+----------+

And here it is with the whole list_first() operation enclosed in an unnest():

SELECT unnest(
    list_first([
            ['Cat', 'Dog'], 
            ['Bird', 'Frog']
        ])
    ) AS unnested;

Output:

+----------+
| unnested |
+----------+
| Cat |
| Dog |
+----------+

We can also use the flatten() function to flatten the nested list:

SELECT list_first(
    flatten([['Cat', 'Dog'], ['Bird', 'Frog']])
    ) AS flattened;

Output:

+-----------+
| flattened |
+-----------+
| Cat |
+-----------+

2. The list_aggregate() Function

The list_aggregate() function is a more general function that allows us to specify which aggregate function to apply. If we pass first as its second argument, we’ll get the same result as list_first():

SELECT list_aggregate(['Cat', 'Dog', 'Bird'], 'first');

Result:

Cat

We can also use unnest() or flatten() when working with nested lists.

3. The first() Function

The first() function is an aggregate function that gets the first value of a column. While it’s not designed to be used on lists, it is possible to use it on a list that’s been unnested:

SELECT first(l)
FROM unnest(['Cat', 'Dog', 'Bird']) AS t(l);

Result:

Cat

To understand this example, here it is without the first() function:

SELECT l
FROM unnest(['Cat', 'Dog', 'Bird']) AS t(l);

Result:

+------+
| l |
+------+
| Cat |
| Dog |
| Bird |
+------+

The first row contains Cat, and that’s what the first() function returned. It simply chose whatever value was in the first row.

4. The LIMIT Clause

We can modify the previous query so that it returns just the first row. To do this, we can use the LIMIT clause (specifically LIMIT 1) to limit the results to just the first row:

SELECT l
FROM unnest(['Cat', 'Dog', 'Bird']) AS t(l)
LIMIT 1;

Output:

+-----+
| l |
+-----+
| Cat |
+-----+

One benefit of this method is that we can use it to return more than one row if required. For example, we can get the first two rows:

SELECT l
FROM unnest(['Cat', 'Dog', 'Bird']) AS t(l)
LIMIT 2;

Output:

+-----+
| l |
+-----+
| Cat |
| Dog |
+-----+