3 Ways to Get the Weighted Average in DuckDB

Weighted averages are common calculations in data analysis, allowing us to assign different levels of importance to individual values in our dataset. Unlike simple averages, where each value has equal impact, weighted averages let us incorporate the relative significance of each observation. This is particularly valuable for scenarios like calculating GPA (where courses have different credit weights), investment portfolio returns (where assets have varying allocations), or quality ratings (where reviewers have different expertise levels).

In this article, we’ll explore three ways of calculating weighted averages in DuckDB.
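
As a minimal sketch of the underlying arithmetic (the sum of each value times its weight, divided by the sum of the weights), the GPA scenario might look like this in DuckDB. The inline table and its grade and credits columns are hypothetical, and this is not necessarily one of the three methods the article covers.

```sql
-- Hypothetical course data: a grade and the credit weight of each course.
SELECT
    sum(grade * credits) / sum(credits) AS weighted_avg
FROM (
    VALUES (4.0, 3), (3.0, 4), (2.0, 1)
) AS courses(grade, credits);
```

For these three rows the calculation works out to 26 / 8 = 3.25, whereas a simple average of the grades would be 3.0.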

A Quick Look at LIMIT & OFFSET in DuckDB

Most database management systems (DBMSs) provide a way to restrict the number of rows a query returns, either to a fixed count or to a percentage of the result set. In many cases this is done with a LIMIT clause (although some DBMSs provide other methods, such as SQL Server’s TOP clause).

DuckDB implements this functionality with the LIMIT clause.
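
A small sketch of LIMIT combined with OFFSET, assuming nothing beyond DuckDB's built-in range() table function. The column alias n is a name chosen here for illustration.

```sql
-- range(10) generates the integers 0 through 9.
SELECT n
FROM range(10) AS t(n)
ORDER BY n      -- a stable order makes LIMIT/OFFSET results predictable
LIMIT 3         -- return at most three rows
OFFSET 2;       -- after skipping the first two
```

With the ordering in place, this returns the rows 2, 3, and 4.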

Extract All Values From a JSON Document With DuckDB’s JSON_TRANSFORM() Function

The json_transform() function in DuckDB is a handy tool for converting JSON strings into structured data types like STRUCT, MAP, and LIST. This allows you to directly query and manipulate nested JSON data using standard SQL, making it much easier to work with complex JSON objects and arrays.

Think of it as a way to cast your JSON data into a more usable, typed format within your database.
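
As a rough illustration, the following sketch passes json_transform() a JSON string and a second JSON document describing the target types, then reads the resulting STRUCT fields with dot notation. The sample document and the person alias are made up for this example, and it assumes DuckDB's JSON extension is available (recent versions load it automatically).

```sql
SELECT
    person.name,   -- typed as VARCHAR
    person.age     -- typed as INTEGER
FROM (
    SELECT json_transform(
        '{"name": "Ava", "age": 30}',
        '{"name": "VARCHAR", "age": "INTEGER"}'
    ) AS person
) AS src;
```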

Performance Tip for Extracting Multiple Values from JSON in DuckDB

DuckDB has a bunch of functions that allow us to extract data from JSON documents. For example, there’s the json_extract() function, which extracts the JSON found at a given path in the specified JSON document.

Often we’ll need to extract multiple values within the same query. For example, we may need to extract both a user’s name and age, so that they’re returned in two separate columns.
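
To make the scenario concrete, here is a sketch of the straightforward approach: one extraction call per column. The excerpt does not say what the performance tip is, so treat this only as the baseline being improved upon. The sample document and the doc alias are hypothetical.

```sql
SELECT
    json_extract_string(doc, '$.name') AS name,  -- returned as VARCHAR
    json_extract(doc, '$.age')         AS age    -- returned as JSON
FROM (
    SELECT '{"name": "Ava", "age": 30}'::JSON AS doc
) AS src;
```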

Using Shorthand to Perform Data Conversions in DuckDB

When it comes to converting between data types, DuckDB performs implicit conversions when required, while also enabling us to perform explicit conversions. DuckDB performs implicit conversions automatically as part of some other operation, such as calling a function that requires its arguments in a different data type than the one we’re providing. For explicit conversions, we have the option of using cast() or try_cast(), or using the shorthand method.

In this article we’ll take a quick look at how to convert between data types using the shorthand method.
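
For comparison, the three explicit options can be sketched side by side, assuming the shorthand in question is the :: operator. The column aliases are chosen here for illustration.

```sql
SELECT
    CAST('42' AS INTEGER)      AS with_cast,
    TRY_CAST('abc' AS INTEGER) AS with_try_cast,   -- NULL rather than an error
    '42'::INTEGER              AS with_shorthand;  -- same conversion as cast()
```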
