Most DuckDB distributions are shipped with the json
extension, which enables us to work with JSON documents. DuckDB provides a bunch of JSON functions, including JSON_EXISTS()
. The JSON_EXISTS()
function allows us to check whether a specific path exists within a JSON document.
This article explores how this function works, along with examples.
Syntax
The basic syntax goes like this:
JSON_EXISTS(json_data, path_expression)
json_data
: The JSON document we want to query. This can be a JSON string or a column containing JSON data.path_expression
: A string that defines the path to look for within the JSON document.
Basic Example
Here’s a simple example that checks for the existence of a specified key in the JSON document:
SELECT JSON_EXISTS('{"name": "John", "age": 30}', '$.name');
Output:
true
In this case we checked if the key “name” exists at the root level, which it does.
Here’s another one:
SELECT JSON_EXISTS('{"name": "John", "age": 30}', '$.email');
Output:
false
In this query, we looked for “email”, which doesn’t exist, so the function returned false
.
Case-Sensitivity
JSON paths are case-sensitive, so be sure to use the right case:
SELECT
JSON_EXISTS('{"name": "John", "age": 30 }', '$.name') AS "$.name",
JSON_EXISTS('{"name": "John", "age": 30 }', '$.Name') AS "$.Name";
Output:
+--------+--------+
| $.name | $.Name |
+--------+--------+
| true | false |
+--------+--------+
Checking Nested Objects
We can use dot-notation to check for nested objects:
SET VARIABLE json_doc = '
{"user":
{"name": "John", "contact":
{"email": "[email protected]"}
}
}';
SELECT
JSON_EXISTS(getvariable('json_doc'), '$.user.contact.email') AS check_email,
JSON_EXISTS(getvariable('json_doc'), '$.user.contact.phone') AS check_phone;
Result:
+-------------+-------------+
| check_email | check_phone |
+-------------+-------------+
| true | false |
+-------------+-------------+
In this case the user has an email at the specified path but not a phone, and so the result of JSON_EXISTS()
reflects that.
Working with Arrays
We can also use JSON_EXISTS()
on JSON arrays:
SET VARIABLE json_doc = '{"users": ["Karra", "Jake", "Igor"]}';
SELECT
JSON_EXISTS(getvariable('json_doc'), '$.users[0]') AS "$.users[0]",
JSON_EXISTS(getvariable('json_doc'), '$.users[3]') AS "$.users[3]",
JSON_EXISTS(getvariable('json_doc'), '$.users[*]') AS "$.users[*]";
Result:
+------------+------------+--------------------+
| $.users[0] | $.users[3] | $.users[*] |
+------------+------------+--------------------+
| true | false | [true, true, true] |
+------------+------------+--------------------+
Checking for Any Element in an Array That Matches a Condition
Let’s say we have an array of user objects and want to check if any user has a specific property:
SET VARIABLE var = '[{"name": "Karra", "age": 20}, {"name": "Jake"}]';
SELECT
JSON_EXISTS(getvariable('var'), '$[*].age') AS "$.users[0]",
JSON_EXISTS(getvariable('var'), '$[*].verified') AS "$[*].verified";
Result:
+------------+---------------+
| $.users[0] | $[*].verified |
+------------+---------------+
| [true] | [] |
+------------+---------------+
The path that got a match returned [true]
whereas the path that didn’t returned []
.
Malformed JSON
The JSON document must be valid JSON. Passing invalid JSON results in an error. Here’s an example of passing a malformed JSON document:
SELECT JSON_EXISTS('{"name": "John", "age" }', '$.name');
Output:
Invalid Input Error:
Malformed JSON at byte 23 of input: unexpected character. Input: {"name": "John", "age" }
In this case the JSON document has an age
field, but there’s no value assigned to that field. This resulted in the error.
We can fix this issue by either adding a value to the age
field, or removing it altogether:
SELECT JSON_EXISTS('{"name": "John", "age": 30 }', '$.name');
Output:
true
Things to Remember
- Case sensitivity: JSON keys are case-sensitive, so
$.Name
and$.name
refer to different paths - Array indexing: Remember that array indices in DuckDB’s JSON path are zero-based
- Invalid JSON: Make sure your JSON is valid before querying with
JSON_EXISTS()