The generate_subscripts()
function in DuckDB creates a series of index values for accessing elements in an array. It can help us iterate through array indices when we need to work with array elements.
Here’s a quick look at DuckDB’s generate_subscripts()
function, along with examples.
What is generate_subscripts()
?
generate_subscripts()
is a table-generating function that produces a series of integer values representing valid indices for a specific array dimension.
The basic syntax is straightforward:
generate_subscripts(array, dimension)
Where:
array
is any array expressiondimension
specifies which dimension of the array to generate indices for (1
for one-dimensional arrays).
Example
Here’s a basic example to demonstrate how it works:
SELECT generate_subscripts(
array_value('Sun', 'Sky', 'Sea'),
1
);
Result:
+----------------------------------------------------------+
| generate_subscripts(array_value('Sun', 'Sky', 'Sea'), 1) |
+----------------------------------------------------------+
| 1 |
| 2 |
| 3 |
+----------------------------------------------------------+
So we can see that one row is returned for each array element, and the value of each row increments with each element.
Specifying Another Dimension
Here’s an example that uses a two-dimensional array. First, let’s get the subscripts for the first dimension:
SELECT generate_subscripts(
array_value(
array_value('Sun', 'Sky', 'Sea'),
array_value('Tag', 'Rub', 'Tub')
),
1
) AS first_dimension;
Result:
+-----------------+
| first_dimension |
+-----------------+
| 1 |
| 2 |
+-----------------+
Now let’s do the second dimension. To do this, all we need to do is change 1
to 2
:
SELECT generate_subscripts(
array_value(
array_value('Sun', 'Sky', 'Sea'),
array_value('Tag', 'Rub', 'Tub')
),
2
) AS second_dimension;
Result:
+------------------+
| second_dimension |
+------------------+
| 1 |
| 2 |
| 3 |
+------------------+
Extracting the Array’s Elements
We can modify our query in order to extract the array’s elements at each corresponding subscript:
SELECT
generate_subscripts(a, 1) AS subscript,
a[generate_subscripts(a, 1)] AS value
FROM
(SELECT array_value('Sky','Sun','Sea','Tag') AS a) t;
Result:
+-----------+-------+
| subscript | value |
+-----------+-------+
| 1 | Sky |
| 2 | Sun |
| 3 | Sea |
| 4 | Tag |
+-----------+-------+
Common Errors
Non-Existent Dimensions
If we specify a dimension that doesn’t exist in the array, an error is returned:
SELECT generate_subscripts(
array_value('Sun', 'Sky', 'Sea'),
2
);
Result:
Out of Range Error: array_length dimension '2' out of range (min: '1', max: '1')
The same error occurs when we pass a dimension of zero:
SELECT generate_subscripts(
array_value('Sun', 'Sky', 'Sea'),
0
);
Result:
Out of Range Error: array_length dimension '0' out of range (min: '1', max: '1')
Omitting the Dimension
Omitting the dimension results in an error:
SELECT generate_subscripts(
array_value('Sun', 'Sky', 'Sea')
);
Result:
Binder Error: Macro function generate_subscripts(arr, dim) requires 2 positional arguments, but a single positional argument was provided.
LINE 1: SELECT generate_subscripts(
^
Passing a Non-Array
If the first argument is not an array, then an error occurs:
SELECT generate_subscripts(
'Sun',
1
);
Result:
Binder Error: No function matches the given name and argument types 'array_length(STRING_LITERAL, INTEGER_LITERAL)'. You might need to add explicit type casts.
Candidate functions:
array_length(ANY[]) -> BIGINT
array_length(ANY[], BIGINT) -> BIGINT
LINE 2: 'Sun',
^