Understanding the JSON_GROUP_STRUCTURE() Function in DuckDB

The json_group_structure() function in DuckDB is an aggregate function that inspects all JSON values within a group and returns a JSON representation of their structure. It essentially infers a “schema” for the JSON objects in that group. This can be useful for understanding the shape and consistency of your JSON data.


Using DuckDB’s FSUM() Function for More Accurate Results

DuckDB has an fsum() function that can be used in place of the regular sum() function to get more accurate results. fsum() calculates the sum using a floating-point summation method known as Kahan summation (or compensated summation).

This method tracks the rounding error lost at each addition and feeds it back into the running total, which reduces the accumulated error that the regular sum() function can build up when summing many floating-point numbers.
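To make the idea concrete, here is a minimal Python sketch of Kahan summation compared with a plain running total. It illustrates the general technique only and is not DuckDB's actual implementation; the names kahan_sum and naive_sum are made up for this example, and Python's math.fsum() is used purely as a reference value.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point sum (no error compensation)."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan (compensated) summation: carry the rounding error forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation            # apply the correction to the next value
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


values = [0.1] * 1_000_000
reference = math.fsum(values)  # correctly rounded sum, used only for comparison

naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
print("naive error:", naive_error)
print("kahan error:", kahan_error)
```

The explicit loop in naive_sum is deliberate: recent Python versions already apply a compensated algorithm inside the built-in sum() for floats, which would hide the difference.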


A Quick Look at DuckDB’s WEIGHTED_AVG() Function

In analytical SQL workloads, computing a weighted average often means writing a verbose expression that combines the sum() function with other operators. DuckDB streamlines this with its native weighted_avg() aggregate function, which lets us compute weighted averages directly. This makes queries clearer and more concise when values contribute unequally to the result, as with population-adjusted metrics or revenue-weighted scores.

This article explores the weighted_avg() function in DuckDB, with examples that demonstrate its usage.
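As a quick illustration of the underlying arithmetic, the following Python sketch computes a weighted average as the sum of value times weight divided by the sum of the weights. The function name and sample data are hypothetical, and the sketch assumes a non-empty input with a non-zero total weight; it does not attempt to model how DuckDB treats NULLs or empty groups.

```python
def weighted_avg(values, weights):
    """Weighted mean: sum(value * weight) / sum(weight).

    Assumes both sequences have the same length and the weights
    do not sum to zero (illustrative sketch only).
    """
    weighted_total = sum(v * w for v, w in zip(values, weights))
    return weighted_total / sum(weights)


# Hypothetical scores where the second one counts three times as much.
scores = [80, 90, 70]
weights = [1, 3, 1]

plain = sum(scores) / len(scores)
weighted = weighted_avg(scores, weights)
print("plain average:", plain)
print("weighted average:", weighted)
```

Here the heavier weight pulls the result toward the second score, which is the effect a plain average cannot express.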


How ARG_MAX_NULL() Works in DuckDB

In DuckDB, the arg_max_null() function works in a similar way to the arg_max() function, in that it finds the row with the maximum value in one column and returns the corresponding value from another column at that row.

Where it differs from arg_max() is in how it handles NULL values. In addition, arg_max_null() accepts only two arguments, whereas arg_max() accepts an optional third argument. There also aren’t any aliases for arg_max_null() at the time of writing, while arg_max() has a couple.

In this article we’ll look at how arg_max_null() works, and we’ll compare it with arg_max() to see how each function handles NULL values.
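The difference can be sketched in plain Python. This models the behavior as described in DuckDB's documentation, which is an assumption here rather than something stated above: arg_max() skips rows where either expression is NULL, while arg_max_null() skips only rows where the value being compared is NULL. The helper functions and sample rows are hypothetical, with None standing in for NULL.

```python
def arg_max(rows):
    """Model of arg_max(arg, val): ignore rows where arg OR val is None."""
    candidates = [(arg, val) for arg, val in rows
                  if arg is not None and val is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])[0]


def arg_max_null(rows):
    """Model of arg_max_null(arg, val): ignore only rows where val is None."""
    candidates = [(arg, val) for arg, val in rows if val is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])[0]


# (arg, val) pairs; the row with the largest val has a None arg.
rows = [("apple", 10), (None, 30), ("cherry", 20), ("date", None)]

result_arg_max = arg_max(rows)            # None-arg row is skipped
result_arg_max_null = arg_max_null(rows)  # None-arg row is kept
print(result_arg_max, result_arg_max_null)
```

Under this model, arg_max() falls back to the next-largest value with a usable argument, whereas arg_max_null() can return NULL when the winning row's argument is NULL.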
