Using LIST_RESIZE() to Resize a List in DuckDB

DuckDB provides us with a list_resize() function for those occasions where we need to resize a list. It also offers an alias called array_resize() that does exactly the same thing. When we resize a list using these functions, we specify how big we want the resulting list to be, and we can also specify a value to use for any extra elements that are added to the list.

Read more

Detect Infinite Values in DuckDB with the ISINF() Function

In DuckDB, we can use the ISINF() function to check whether a value is finite. DuckDB supports infinite values, and so this function allows us to check for that. This can be useful when working with floating-point data that might contain special values like NaN (Not a Number) or infinity. The function also works on date and timestamp values, as they can be infinite too.

Read more

Understanding DuckDB’s ARG_MIN_NULL() Function

DuckDB has a arg_min_null() function that works in a similar way to the arg_min() function. That is, it finds the row with the minimum value in one column and returns the corresponding value from another column at that row.

But there’s also a difference between these two functions. The main difference is in the way they deal with NULL values. Also, arg_min_null() only accepts two arguments, whereas arg_min() accepts an optional third argument. Additionally, there aren’t any aliases for arg_min_null() at the time of writing (arg_min() has a couple of aliases).

In this article we’ll look at how arg_min_null() works, and we’ll compare it with arg_min() to see how each function handles NULL values.

Read more

Convert Column Values to a JSON Array with DuckDB’s JSON_GROUP_ARRAY() Function

The json_group_array() function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg() or the group_concat() function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.

This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.

Read more

3 Ways to Get the Weighted Average in DuckDB

Weighted averages are common calculations in data analysis, allowing us to assign different levels of importance to individual values in our dataset. Unlike simple averages, where each value has equal impact, weighted averages let us incorporate the relative significance of each observation. This is particularly valuable for scenarios like calculating GPA (where courses have different credit weights), investment portfolio returns (where assets have varying allocations), or quality ratings (where reviewers have different expertise levels).

In this article, we’ll explore three ways of calculating weighted averages in DuckDB.

Read more

Extract All Values From a JSON Document With DuckDB’s JSON_TRANSFORM() Function

The json_transform() function in DuckDB is a handy tool for converting JSON strings into structured data types like STRUCT, MAP, and LIST. This allows you to directly query and manipulate nested JSON data using standard SQL, making it much easier to work with complex JSON objects and arrays.

Think of it as a way to cast your JSON data into a more usable, typed format within your database.

Read more