Convert Column Values to a JSON Array with DuckDB’s JSON_GROUP_ARRAY() Function

The json_group_array() function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg() or the group_concat() function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.

This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.

For example, if we have a table of orders where each row represents an item in an order, we could use json_group_array() to collect all the items for each order into a single JSON array. This way, we can produce a compact and structured JSON representation of an order, with all its items embedded as an array.

Combined with other JSON functions like json_object(), it allows for powerful nesting and transformation of relational data into JSON documents—enabling use cases like API responses, data warehousing, and more.

Example

Let’s go ahead with that idea and create a table called orders where each row represents an item in a customer order:

CREATE TABLE orders (
    order_id INTEGER,
    customer_name TEXT,
    item_name TEXT,
    quantity INTEGER,
    price DOUBLE
);
INSERT INTO orders VALUES
(1, 'Amy', 'Apple', 2, 0.5),
(1, 'Amy', 'Banana', 5, 0.2),
(2, 'Sergey', 'Orange', 3, 0.6),
(2, 'Sergey', 'Grapes', 1, 2.0),
(3, 'Ernesto', 'Mango', 4, 1.5);

That created the orders table and populated it.

Basic Query

Now let’s run a query that uses json_group_array():

SELECT json_group_array(item_name) AS all_items
FROM orders;

Result:

+----------------------------------------------+
| all_items |
+----------------------------------------------+
| ["Apple","Banana","Orange","Grapes","Mango"] |
+----------------------------------------------+

This query outputs all items in the table into a single JSON array.

While doing this is fine, we might prefer to group the items by order. We’ll do that in the next example.

Grouping by Order

Here, we go a step further and group the items by the order_id column:

SELECT
    order_id,
    json_group_array(item_name) AS items
FROM orders
GROUP BY order_id;

Result:

+----------+---------------------+
| order_id | items |
+----------+---------------------+
| 1 | ["Apple","Banana"] |
| 2 | ["Orange","Grapes"] |
| 3 | ["Mango"] |
+----------+---------------------+

This returned all the items for each order grouped as a JSON array.

Group Entire Item Objects Per Order

We can combine json_group_array() with json_object() to create a JSON object for each item, then group them:

SELECT
    order_id,
    json_group_array(
        json_object('item', item_name, 'quantity', quantity, 'price', price)
        ) AS items
FROM orders
GROUP BY order_id;

Result:

+----------+-----------------------------------------------------------------------------------------+
| order_id | items |
+----------+-----------------------------------------------------------------------------------------+
| 1 | [{"item":"Apple","quantity":2,"price":0.5},{"item":"Banana","quantity":5,"price":0.2}] |
| 2 | [{"item":"Orange","quantity":3,"price":0.6},{"item":"Grapes","quantity":1,"price":2.0}] |
| 3 | [{"item":"Mango","quantity":4,"price":1.5}] |
+----------+-----------------------------------------------------------------------------------------+

Summary

  • json_group_array() collects values into a JSON array.
  • It can be useful when combined with json_object() to create nested, structured JSON.
  • Great for transforming tabular data into JSON documents in DuckDB.