Convert Column Values to a JSON Array with DuckDB’s JSON_GROUP_ARRAY() Function

The json_group_array() function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg() or the group_concat() function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.

This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.

Continue reading

Extract All Values From a JSON Document With DuckDB’s JSON_TRANSFORM() Function

The json_transform() function in DuckDB is a handy tool for converting JSON strings into structured data types like STRUCT, MAP, and LIST. This allows you to directly query and manipulate nested JSON data using standard SQL, making it much easier to work with complex JSON objects and arrays.

Think of it as a way to cast your JSON data into a more usable, typed format within your database.

Continue reading

Performance Tip for Extracting Multiple Values from JSON in DuckDB

DuckDB has a bunch of functions that allow us to extract data from JSON documents. For example, there’s the json_extract() function, which extracts JSON from the specified JSON document.

Often times we’ll need to extract multiple values within the same query. For example, we may need to extract both a user’s name and age, so that they’re returned in two separate columns.

Continue reading

Get the Number of Elements in a JSON Array with JSON_ARRAY_LENGTH() in DuckDB

In DuckDB, the json_array_length() function is used to determine the number of elements in a JSON array. This function can return the count of items in the top-level array or in a nested array within it. It’s especially useful when working with complex JSON structures where arrays are embedded inside objects or other arrays. By combining it with JSON path expressions, you can target specific portions of the data for more precise analysis.

Continue reading

Understanding the JSON_GROUP_STRUCTURE() Function in DuckDB

The json_group_structure() function in DuckDB is an aggregate function that inspects all JSON values within a group and returns a JSON representation of their structure. It essentially infers a “schema” for the JSON objects in that group. This can be useful for understanding the shape and consistency of your JSON data.

Continue reading