4 DuckDB Functions that Find the Position of a Value in a List

As with most other database management systems (DBMSs), DuckDB provides us with a way of finding the position of an element in a list. This is the index at which the specified element occurs in the list.

Actually, DuckDB provides us with four synonymous functions for doing this (i.e. they all do the same thing).

The Functions

DuckDB provides the following functions for getting the index of an element from a list:

  • list_position()
  • list_indexof()
  • array_position()
  • array_indexof()

The syntax goes like this:

list_position(list, element)
list_indexof(list, element)
array_position(list, element)
array_indexof(list, element)

Example

Here’s a quick example to demonstrate the list_position() function:

SELECT list_position(['apple', 'banana', 'cherry'], 'banana');

Result:

2

In DuckDB, lists start at 1, and so in this example, banana is at position 2.

Here’s an example that shows all four functions together:

SET VARIABLE fruit = ['apple', 'banana', 'cherry', 'kiwi'];

SELECT 
    list_position(getvariable('fruit'), 'kiwi') AS list_position,
    list_indexof(getvariable('fruit'), 'kiwi') AS list_indexof,
    array_position(getvariable('fruit'), 'kiwi') AS array_position,
    array_indexof(getvariable('fruit'), 'kiwi') AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| 4 | 4 | 4 | 4 |
+---------------+--------------+----------------+---------------+

Multiple Occurrences

If the list contains more than one instance of the specified value, then the functions return the first occurrence:

SET VARIABLE fruit = ['apple', 'kiwi', 'cherry', 'kiwi'];

SELECT 
    list_position(getvariable('fruit'), 'kiwi') AS list_position,
    list_indexof(getvariable('fruit'), 'kiwi') AS list_indexof,
    array_position(getvariable('fruit'), 'kiwi') AS array_position,
    array_indexof(getvariable('fruit'), 'kiwi') AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| 2 | 2 | 2 | 2 |
+---------------+--------------+----------------+---------------+

When the List Doesn’t Contain the Value

If the list doesn’t contain the specified value, then the functions return NULL:

.nullvalue 'null'
SET VARIABLE fruit = ['apple', 'orange', 'cherry', 'banana'];

SELECT 
    list_position(getvariable('fruit'), 'kiwi') AS list_position,
    list_indexof(getvariable('fruit'), 'kiwi') AS list_indexof,
    array_position(getvariable('fruit'), 'kiwi') AS array_position,
    array_indexof(getvariable('fruit'), 'kiwi') AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| null | null | null | null |
+---------------+--------------+----------------+---------------+

Here I used .nullvalue 'null' in order to make my DuckDB CLI output null whenever a NULL value is encountered. By default it returns an empty string, which makes it harder to distinguish NULL values from an actual empty string.

Finding Empty Strings

Speaking of empty strings, we can search for those too:

SET VARIABLE fruit = ['apple', 'kiwi', '', 'banana'];

SELECT 
    list_position(getvariable('fruit'), '') AS list_position,
    list_indexof(getvariable('fruit'), '') AS list_indexof,
    array_position(getvariable('fruit'), '') AS array_position,
    array_indexof(getvariable('fruit'), '') AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| 3 | 3 | 3 | 3 |
+---------------+--------------+----------------+---------------+

Searching for NULL Values

We can’t say the same for NULL values. If we search for a NULL value, a NULL value will be returned, even if the list contains a NULL value:

SET VARIABLE fruit = ['apple', 'kiwi', NULL, 'banana'];

SELECT 
    list_position(getvariable('fruit'), NULL) AS list_position,
    list_indexof(getvariable('fruit'), NULL) AS list_indexof,
    array_position(getvariable('fruit'), NULL) AS array_position,
    array_indexof(getvariable('fruit'), NULL) AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| null | null | null | null |
+---------------+--------------+----------------+---------------+

A NULL value is the absence of any value, so searching for NULL is like searching for an unknown value. Actually, according to the DuckDB documentation:

Any comparison with a NULL value returns NULL, including NULL = NULL

But if we don’t search for a NULL value, the functions will simply skip any NULL values in the list:

SET VARIABLE fruit = ['apple', 'kiwi', NULL, 'banana'];

SELECT 
    list_position(getvariable('fruit'), 'kiwi') AS list_position,
    list_indexof(getvariable('fruit'), 'kiwi') AS list_indexof,
    array_position(getvariable('fruit'), 'kiwi') AS array_position,
    array_indexof(getvariable('fruit'), 'kiwi') AS array_indexof;

Result:

+---------------+--------------+----------------+---------------+
| list_position | list_indexof | array_position | array_indexof |
+---------------+--------------+----------------+---------------+
| 2 | 2 | 2 | 2 |
+---------------+--------------+----------------+---------------+