Understanding DuckDB’s LIST_GRADE_UP() Function

Among DuckDB’s many tools for handling list data is the list_grade_up() function. This function works similarly to a sort operation, but instead of returning the sorted values themselves, it returns the indexes that represent the positions of those values in the original list.

Let’s take a quick look.

Syntax

list_grade_up(ANY[]) -> ANY[]
list_grade_up(ANY[], VARCHAR) -> ANY[]
list_grade_up(ANY[], VARCHAR, VARCHAR) -> ANY[]

The first form is the simplest, and accepts just the list. The second form includes an argument that allows us to determine the sort order. The third form allows us to specify how to sort NULL values.

There’s also an array_grade_up() function that’s an alias of list_grade_up(). So we can use either one to get the same result.

Example

Here’s an example of the simplest form, where we simply pass the list:

SELECT list_grade_up(['apple', 'kiwi', 'banana']);

Result:

[1, 3, 2]

This has sorted the original list, and then returned the indexes of each of those values. So we can see that apple has an index of 1, kiwi is 3, and banana is 2.

If we move the items of the list around, we’ll get a different result:

SELECT list_grade_up(['kiwi', 'banana', 'apple']);

Result:

[3, 2, 1]

Here it is alongside the list_sort() function, so that you can see what the list looks like when it’s actually sorted:

SELECT 
    ['apple', 'kiwi', 'banana'] AS original_list,
    list_sort(['apple', 'kiwi', 'banana']) AS list_sort,
    list_grade_up(['apple', 'kiwi', 'banana']) AS list_grade_up;

Result:

+-----------------------+-----------------------+---------------+
| original_list | list_sort | list_grade_up |
+-----------------------+-----------------------+---------------+
| [apple, kiwi, banana] | [apple, banana, kiwi] | [1, 3, 2] |
+-----------------------+-----------------------+---------------+

Duplicate Values

Here’s what happens if the list contains duplicate values:

SELECT list_grade_up(['apple', 'apple', 'kiwi', 'banana']);

Result:

[1, 2, 4, 3]

As you might expect, it doesn’t return duplicate indexes. The duplicate values still have their own respective list indexes. After all, duplicate values are still separate list items.

Here’s what happens if we move one of the duplicate values:

SELECT list_grade_up(['apple', 'kiwi', 'banana', 'apple']);

Result:

[1, 4, 3, 2]

The result reflects its new position, but it still has the same index.

Ordering

We can pass a second argument to specify how the list can be sorted:

SELECT 
    list_grade_up(['apple', 'kiwi', 'banana', 'apple'], 'ASC') AS ascending,
    list_grade_up(['apple', 'kiwi', 'banana', 'apple'], 'DESC') AS descending;

Result:

+--------------+--------------+
| ascending | descending |
+--------------+--------------+
| [1, 4, 3, 2] | [2, 3, 1, 4] |
+--------------+--------------+

NULL Values

Lists can contain NULL values, and such values still have their own index in the list:

SELECT list_grade_up(['apple', NULL, 'banana']);

Result:

[1, 3, 2]

We can see that the NULL value has an index of 3.

By default, DuckDB sorts NULL values last. However, we can change this if required:

SELECT 
    list_grade_up(['apple', NULL, 'banana'], 'ASC', 'NULLS_FIRST') AS 'ASC, NULLS_FIRST',
    list_grade_up(['apple', NULL, 'banana'], 'ASC', 'NULLS_LAST') AS 'ASC, NULLS_LAST',
    list_grade_up(['apple', NULL, 'banana'], 'DESC', 'NULLS_FIRST') AS 'DESC, NULLS_FIRST',
    list_grade_up(['apple', NULL, 'banana'], 'DESC', 'NULLS_LAST') AS 'DESC, NULLS_LAST';

Result:

+------------------+-----------------+-------------------+------------------+
| ASC, NULLS_FIRST | ASC, NULLS_LAST | DESC, NULLS_FIRST | DESC, NULLS_LAST |
+------------------+-----------------+-------------------+------------------+
| [2, 1, 3] | [1, 3, 2] | [2, 3, 1] | [3, 1, 2] |
+------------------+-----------------+-------------------+------------------+

This has the same effect that we’d see if we’d used the default_order and default_null_order configuration settings. For example:

SET default_order = 'ASC';
SET default_null_order = 'NULLS_FIRST';
SELECT list_grade_up(['apple', NULL, 'banana']);

Result:

[2, 1, 3]

Bear in mind that the configuration settings will be in effect until you change them again.

Nested Lists

We can use list_grade_up() with nested lists:

SELECT list_grade_up([['apple', 'kiwi', 'cherry'], ['banana', 'orange']]);

Result:

[1, 2]

Let’s switch the nested lists around:

SELECT list_grade_up([['banana', 'orange'], ['apple', 'kiwi', 'cherry']]);

Result:

[2, 1]

However, if we unnest the list or flatten it we’ll get a different result.

Flattened Lists

We can use the flatten() function to flatten the list:

SELECT list_grade_up( 
    flatten([ 
        ['apple', 'kiwi', 'cherry'], 
        ['banana', 'orange']
        ]) 
    );

Result:

[1, 4, 3, 2, 5]

The flatten() function concatenates all nested lists into a single list.

Unnested Lists

Here it is with the unnest() function:

SELECT list_grade_up( 
    unnest([ 
        ['apple', 'kiwi', 'cherry'], 
        ['banana', 'orange']
        ]) 
    ) AS unnested;

Result:

+-----------+
| unnested |
+-----------+
| [1, 3, 2] |
| [1, 2] |
+-----------+

The array_grade_up() Function

As mentioned, DuckDB also has an array_grade_up() function, which is synonymous to list_grade_up(). In other words, we can use either one, in the same way, to get the same result. For example:

SELECT 
    list_grade_up(['kiwi', 'banana', 'apple']) AS list_grade_up,
    list_grade_up(['kiwi', 'banana', 'apple']) AS array_grade_up;

Result:

+---------------+----------------+
| list_grade_up | array_grade_up |
+---------------+----------------+
| [3, 2, 1] | [3, 2, 1] |
+---------------+----------------+