In DuckDB, the json_structure() function is used to parse and introspect JSON data, returning the structure of a JSON document. This is especially useful when you’re working with semi-structured JSON data and you want to understand its schema (including nested keys, arrays, and types) without manually inspecting the raw JSON.
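As a minimal sketch of that kind of introspection, the following query passes a small made-up document (the keys and values are purely illustrative) to json_structure(). The result is itself JSON, describing the type DuckDB detected for each key:

```sql
-- Hypothetical sample document; the keys and values are invented for illustration.
SELECT json_structure('{"name": "Duck", "tags": ["a", "b"], "owner": {"id": 7}}') AS structure;
```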
Category: DuckDB
3 Easy Ways to Calculate the Median Value from a List in DuckDB
While we might typically use DuckDB’s median() function to get the median value from a column, getting the median value from a list requires a slightly different approach. A list can contain many values, so we need a way to calculate the median across all of the values inside it, rather than across the rows of a column as median() normally does.
Fortunately, DuckDB provides us with several easy options for achieving this. Below are three options for getting the median value from a list in DuckDB.
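The excerpt above doesn’t name the three options, so the following is only a sketch of two approaches that should work: applying the median aggregate to the list with list_aggregate(), and unnesting the list into rows so that the regular median() function can be used. The list values are made up for illustration.

```sql
-- Approach 1: apply the 'median' aggregate directly to the list.
SELECT list_aggregate([3, 1, 5], 'median') AS median_value;  -- 3.0

-- Approach 2: unnest the list into rows, then aggregate as usual.
SELECT median(x) AS median_value
FROM (SELECT unnest([3, 1, 5]) AS x);  -- 3.0
```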
How EPOCH_NS() Works in DuckDB
DuckDB provides an epoch_ns() function that returns the Unix epoch time from a given date/time value in nanoseconds.
Unix epoch time is typically expressed as the number of seconds that have elapsed since January 1, 1970 (UTC), but it can also be expressed in other units, such as milliseconds, microseconds, and nanoseconds. DuckDB has specific functions for each of these units, with epoch_ns() being the function that returns it in nanoseconds.
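Here’s a quick sketch using a timestamp exactly one second after the epoch, which makes the units easy to verify. The epoch_ms() and epoch_us() calls are included only for comparison with the other units mentioned above.

```sql
-- One second after the Unix epoch, expressed in three different units.
SELECT
    epoch_ms(TIMESTAMP '1970-01-01 00:00:01') AS milliseconds,  -- 1000
    epoch_us(TIMESTAMP '1970-01-01 00:00:01') AS microseconds,  -- 1000000
    epoch_ns(TIMESTAMP '1970-01-01 00:00:01') AS nanoseconds;   -- 1000000000
```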
Understanding DuckDB’s JSON_VALID() Function
Most of the major RDBMSs that support JSON also include a json_valid() function for checking the validity of a JSON document, and DuckDB is no exception. Like the other RDBMSs, DuckDB’s implementation of json_valid() checks its argument and returns a boolean value to indicate whether or not it’s valid JSON.
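For example, the following sketch checks one well-formed document and one that’s been cut off mid-object (both strings are made up for illustration):

```sql
SELECT
    json_valid('{"name": "Duck"}') AS well_formed,  -- true
    json_valid('{"name": ')        AS truncated;    -- false
```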
3 DuckDB Functions that Prepend a Value to a List
DuckDB provides us with a few functions that prepend a value to a list. These functions are aliases of one another, so they all do the same thing, with the same or very similar basic syntax.
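The excerpt doesn’t list the functions, so this sketch uses just one of them, list_prepend(), which takes the new element first and the list second. Check the documentation for your DuckDB version before assuming the other names accept their arguments in the same order.

```sql
-- Prepend the value 3 to the front of an existing list.
SELECT list_prepend(3, [4, 5, 6]) AS result;  -- [3, 4, 5, 6]
```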
Using LIST_INTERSECT() to Extract Overlapping Elements from Two Lists in DuckDB
The list_intersect() function can be quite handy when working with lists in DuckDB. It accepts two lists as arguments and returns a list of the elements that are common to both. It essentially performs a set intersection operation while preserving the list data type.
Duplicate values are ignored, so only one value is returned even if there are multiple occurrences in one or both lists.
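To illustrate, here’s a sketch with two made-up lists that share the values 2 and 3, with 2 repeated in both. Each shared value should appear only once in the result. The order of the returned elements isn’t something to rely on, so no particular order is shown here.

```sql
-- 2 and 3 are common to both lists; 2 appears twice in each input list.
SELECT list_intersect([1, 2, 2, 3], [2, 2, 3, 4]) AS common_elements;
```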
An Important Consideration When Choosing Between LIST_CONCAT() and || For Concatenating Lists in DuckDB
When it comes to concatenating lists in DuckDB, we can use the list_concat() function (and its aliases), or the list concatenation operator (||). Often when we have a choice like this, we can use either option interchangeably. But that isn’t quite true in this case.
There’s an important difference between list_concat() and the || operator in DuckDB, and it all comes down to how each one handles NULL values.
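The following sketch makes it easy to see the difference for yourself. It assumes the behavior described in the DuckDB documentation at the time of writing, where list_concat() skips NULL inputs while the || operator returns NULL if either operand is NULL. This may vary between versions, so it’s worth running against your own installation.

```sql
SELECT
    list_concat([1, 2], NULL) AS using_function,  -- expected: [1, 2] (NULL input skipped)
    [1, 2] || NULL            AS using_operator;  -- expected: NULL
```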
A Quick Look at DuckDB’s JSON_VALUE() Function
DuckDB provides a handful of functions for getting data from JSON documents. We can use them as long as the JSON extension is installed and loaded (which it is in most distributions). One such function for getting data from a JSON document is json_value(). This function extracts scalar data from the specified path in the JSON document. If the value isn’t scalar, then a NULL value is returned.
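Here’s a minimal sketch of that behavior, using a made-up document where one key holds a scalar and another holds an array. Going by the description above, the first path should return the scalar, and the second should return NULL.

```sql
SELECT
    json_value('{"a": 1, "b": [1, 2]}', '$.a') AS scalar_value,  -- 1
    json_value('{"a": 1, "b": [1, 2]}', '$.b') AS array_value;   -- NULL (not a scalar)
```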
Understanding JSON_EXTRACT_STRING() in DuckDB
DuckDB has a json_extract_string() function that works similarly to json_extract(), except that it returns its result as a string (varchar). The json_extract() function, on the other hand, returns its result as JSON.
The purpose of these two functions is to extract data from a JSON document. We’ll focus on the json_extract_string() function in this article.
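The difference is easiest to see when both functions are run against the same path. In this sketch (the document is made up for illustration), json_extract() returns the value as JSON, so the string keeps its double quotes, while json_extract_string() returns the bare text.

```sql
SELECT
    json_extract('{"name": "Duck"}', '$.name')        AS as_json,    -- "Duck"
    json_extract_string('{"name": "Duck"}', '$.name') AS as_string;  -- Duck
```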
Using JSON_EXTRACT() in DuckDB
DuckDB has a json_extract() function that extracts JSON data from a JSON document. It enables us to get JSON values from within the JSON document, rather than returning the whole document itself. This article takes a quick look at the function along with some examples of usage.
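As a simple sketch, the following query uses a path to reach into a nested object and pick out one array element. The document and path are made up for illustration, and array indexes are zero-based.

```sql
-- Get the second element of the array stored at a.b.
SELECT json_extract('{"a": {"b": [10, 20, 30]}}', '$.a.b[1]') AS result;  -- 20
```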