The Difference Between LIST_SELECT() and LIST_SLICE() in DuckDB

DuckDB has a list_select() function and a list_slice() function, and both do a similar thing. They allow us to extract values from lists based on their index in the list. But they’re quite different in the way they work. One function allows us to select elements based on a range, while the other function allows us to handpick each element we want returned.

The Difference

Here’s how each function works:

  • list_select(): Accepts two lists. The second list contains the indexes of the elements that we want to extract from the first list.
  • list_slice(): Allows us to specify a range for which to extract items from the specified list. All elements in that range are returned. We can also specify a “step” value, which can be used to select only some elements within the specified range.

Example

Here’s an example that illustrates the difference between the two functions:

SELECT 
list_select(['Sky', 'Box', 'Run', 'Ant', 'Cam'], [2, 4]) AS list_select,
list_slice(['Sky', 'Box', 'Run', 'Ant', 'Cam'], 2, 4) AS list_slice;

Output:

+-------------+-----------------+
| list_select | list_slice |
+-------------+-----------------+
| [Box, Ant] | [Box, Run, Ant] |
+-------------+-----------------+

We can see that list_slice() returned all elements from the specified range, whereas list_select() returns just the specified list items themselves.

When Both Functions Return the Same Result

It’s possible to get the same result from both functions, depending on the arguments we pass. For example, we could do the following:

SELECT 
    list_select(['Sky', 'Box', 'Run', 'Ant', 'Cam'], [2, 3, 4]) AS list_select,
    list_slice(['Sky', 'Box', 'Run', 'Ant', 'Cam'], 2, 4) AS list_slice;

Output:

+-----------------+-----------------+
| list_select | list_slice |
+-----------------+-----------------+
| [Box, Run, Ant] | [Box, Run, Ant] |
+-----------------+-----------------+

Here, the list_select() function’s second argument contained the list indexes for the whole range that list_slice() returns. But this required us to explicitly specify each element within the range, which could become quite unwieldy if we need to work with a large range.

Another thing we could do is modify the list_slice() function’s arguments to return the results that we originally got with list_select():

SELECT 
    list_select(['Sky', 'Box', 'Run', 'Ant', 'Cam'], [2, 4]) AS list_select,
    list_slice(['Sky', 'Box', 'Run', 'Ant', 'Cam'], 2, 4, 2) AS list_slice;

Output:

+-------------+------------+
| list_select | list_slice |
+-------------+------------+
| [Box, Ant] | [Box, Ant] |
+-------------+------------+

In this case I passed a step value to list_slice(), which caused it to skip a value.

This technique will only go so far if we’re trying to mimic the results of list_select(). For example, the list_select() function can be useful in the following scenario:

SELECT list_select(
    ['Sky', 'Box', 'Run', 'Ant', 'Cam', 'Sit', 'Zip', 'Cat', 'Dog', 'Lit'], 
    [1, 4, 5, 10]
    );

Result:

[Sky, Ant, Cam, Lit]

The list_slice() function wouldn’t quite cut it in this situation.

Summary

So in summary:

  • Use list_select() to pick arbitrary list elements.
  • Use list_slice() to get all elements within a certain range.