The json_group_array()
function in DuckDB is used to aggregate values into a JSON array, making it especially useful when working with structured or semi-structured data such as JSON. This function is part of DuckDB’s JSON extension and works similarly to DuckDB’s string_agg()
or the group_concat()
function in some other RDBMSs, but instead of returning a delimited string, it returns a well-formed JSON array.
This function is particularly helpful when we need to represent grouped or hierarchical data in a JSON format for export, reporting, or further transformation.
For example, if we have a table of orders where each row represents an item in an order, we could use json_group_array()
to collect all the items for each order into a single JSON array. This way, we can produce a compact and structured JSON representation of an order, with all its items embedded as an array.
Combined with other JSON functions like json_object()
, it allows for powerful nesting and transformation of relational data into JSON documents—enabling use cases like API responses, data warehousing, and more.
Example
Let’s go ahead with that idea and create a table called orders
where each row represents an item in a customer order:
CREATE TABLE orders (
order_id INTEGER,
customer_name TEXT,
item_name TEXT,
quantity INTEGER,
price DOUBLE
);
INSERT INTO orders VALUES
(1, 'Amy', 'Apple', 2, 0.5),
(1, 'Amy', 'Banana', 5, 0.2),
(2, 'Sergey', 'Orange', 3, 0.6),
(2, 'Sergey', 'Grapes', 1, 2.0),
(3, 'Ernesto', 'Mango', 4, 1.5);
That created the orders
table and populated it.
Basic Query
Now let’s run a query that uses json_group_array()
:
SELECT json_group_array(item_name) AS all_items
FROM orders;
Result:
+----------------------------------------------+
| all_items |
+----------------------------------------------+
| ["Apple","Banana","Orange","Grapes","Mango"] |
+----------------------------------------------+
This query outputs all items in the table into a single JSON array.
While doing this is fine, we might prefer to group the items by order. We’ll do that in the next example.
Grouping by Order
Here, we go a step further and group the items by the order_id
column:
SELECT
order_id,
json_group_array(item_name) AS items
FROM orders
GROUP BY order_id;
Result:
+----------+---------------------+
| order_id | items |
+----------+---------------------+
| 1 | ["Apple","Banana"] |
| 2 | ["Orange","Grapes"] |
| 3 | ["Mango"] |
+----------+---------------------+
This returned all the items for each order grouped as a JSON array.
Group Entire Item Objects Per Order
We can combine json_group_array()
with json_object()
to create a JSON object for each item, then group them:
SELECT
order_id,
json_group_array(
json_object('item', item_name, 'quantity', quantity, 'price', price)
) AS items
FROM orders
GROUP BY order_id;
Result:
+----------+-----------------------------------------------------------------------------------------+
| order_id | items |
+----------+-----------------------------------------------------------------------------------------+
| 1 | [{"item":"Apple","quantity":2,"price":0.5},{"item":"Banana","quantity":5,"price":0.2}] |
| 2 | [{"item":"Orange","quantity":3,"price":0.6},{"item":"Grapes","quantity":1,"price":2.0}] |
| 3 | [{"item":"Mango","quantity":4,"price":1.5}] |
+----------+-----------------------------------------------------------------------------------------+
Summary
json_group_array()
collects values into a JSON array.- It can be useful when combined with
json_object()
to create nested, structured JSON. - Great for transforming tabular data into JSON documents in DuckDB.