5 Functions that Concatenate Lists in DuckDB

DuckDB provides us with a bunch of list concatenation functions that do exactly the same thing; concatenate two lists. Actually, they’re all synonyms and so they can all be used interchangeably. There’s also a more general concatenation function that can also be used on lists.

So this article presents five functions that we can use to concatenate lists.

Continue reading

Fix “Could not convert string ‘…’ to INT64” When Using the GENERATE_SUBSCRIPTS() Function in DuckDB

If you’re getting an error that reads something like “Could not convert string ‘…’ to INT64” when using the generate_subscripts() function in DuckDB, it appears that your second argument is a string, when it should be an integer.

DuckDB’s generate_subscripts() function accepts two arguments; the array as the first argument, and the dimension as the second argument. The second argument must be INT64 (or be able to be implicitly converted to that type). Passing the wrong data type as the second argument can cause the above error to occur.

To fix this issue, make sure that the second argument is compatible with INT64.

Continue reading

Understanding EXTRACT() in DuckDB

DuckDB offers a variety of date functions, one of which is the extract() function. This function is designed to retrieve a specific date part from a date or timestamp value, and it can also be applied to intervals.

In this article, we’ll take a closer look at the extract() function and provide some straightforward examples to illustrate how it works.

Continue reading

Adding Quotes Around Field Values When Outputting DuckDB Query Results as a List

In this article we look at how to automatically enclose values in single quotes when outputting DuckDB query results as a list. Using this method, each value is formatted as SQL literals. This can be useful when you want to escape the values for SQL or for some other purpose.

We also look at how CSV output is sometimes quoted with double quotes, and how we can adjust the results by doing things like removing the headers and changing the list separator.

Continue reading

Sampling Rows from a Table in DuckDB with the SAMPLE Clause

DuckDB’s SAMPLE clause is a handy feature that allows us to work with a random subset of our data. This is particularly useful when dealing with large datasets where processing the entire dataset might be time-consuming or unnecessary for exploratory data analysis, testing queries, or creating representative samples.

When we use this clause, we can specify the absolute number of rows to return, or a percentage of rows. We also have an option of sampling method to use.

Continue reading